Abstract


Climate change and impact studies are gaining importance in the wake of a
changing climate and its effects. Climate models, namely General Circulation
Models (GCMs), have been developed by different research groups to study the
climate at global scale, and they are the primary data-set available for
modelling future global climate change. However, owing to their coarse
spatial resolution, GCMs are not appropriate for impact studies at local
scale, which require a finer spatial resolution. Therefore, for impact
studies, the outputs of global climate models are related to local
atmospheric and climate conditions, such as temperature and precipitation,
through the downscaling process. Different downscaling techniques, ranging
from simple to dynamical downscaling techniques, have been developed by
researchers to build mathematical models that relate GCM outputs to local
observations.

Among these, statistical downscaling techniques are the most widely used,
owing to the ease of their implementation in computer-based tools. SDSM is
one of the widely used software tools for statistical downscaling of GCM
data-sets. However, the available statistical downscaling software tools are
not suited to automating the downscaling process for multiple grids of a
given area of interest (AOI). With the existing tools, manual intervention is
required to downscale GCM data to local scale for AOIs of sizeable spatial
extent.

In this research work, a novel generalized downscaling model, namely the
Efficient Multi-site Statistical Downscaling Model (EMSDM), based on the
multivariate regression technique, has been developed to automate the
downscaling process for multiple grids. EMSDM can be applied to automate the
downscaling of GCM data to multiple local grids of an AOI. The internal
procedures of EMSDM are programmed in the platform-independent C programming
language to handle large volumes of GCM and local observation data
efficiently and to carry out complex mathematical computations such as the
inversion of large matrices. To demonstrate the applicability of the model,
the second generation Canadian Earth System Model (CanESM2), a GCM developed
by the Canadian Centre for Climate Modelling and Analysis (CCCma) of
Environment and Climate Change Canada, and local daily precipitation and
temperature data-sets acquired from the India Meteorological Department (IMD)
have been used to carry out downscaling with the proposed model. India has
been selected as the AOI.

On the basis of the analysis of the downscaling results generated by the
model, it can be concluded that the proposed model can efficiently carry out
statistical downscaling for an AOI comprising multiple grids, irrespective of
its extent. The results generated by the proposed model can be utilized by
investigators to carry out climate impact studies for AOIs of large spatial
extent.

Moreover, to facilitate the spatial geo-visualization of downscaling results,
a web GIS based framework has been developed to geo-visualize the time series
data generated by EMSDM. In addition to downscaling, EMSDM is able to
generate valuable spatial data-sets pertaining to the local observations and
GCM outputs of a given area of interest. These spatial data-sets can be
utilized by decision makers to investigate the spatial distribution of
climatological parameters such as temperature and precipitation.

Chapter 1

Introduction 

1.1 General 

India is facing a dismal situation. Out of the twenty major river basins,
fourteen are considered over-stressed due to population explosion,
over-exploitation of resources and other activities, which in turn have a
negative impact on the climate. The negative impact of climate change has
made the problem more intricate. Water availability decreased from 1816 cubic
meters per capita to 1545 cubic meters per capita in the span of ten years
from 2001 to 2011. The temperature of the Indian subcontinent is expected to
rise by 2.5°C to 4.5°C by 2100. Moreover, according to the Intergovernmental
Panel on Climate Change (IPCC) Technical Report on Climate Change and Water,
changes in the large-scale hydrological cycle have been related to an
increase in the observed temperature, resulting in irregular precipitation
patterns over a number of decades. Hence, the overall net impact of climate
change on water resources is negative.

To investigate the climate change of a region, climate models have been
formulated, and predictions are carried out using these models by applying
the downscaling process and developing trends of climate variables such as
temperature and precipitation for future years or decades. Downscaling is a
technique that relates large-scale climatic variables, such as atmospheric
pressure and geopotential height, to local surface variables, for example
temperature and precipitation. Although existing research on climate and
hydrology provides methodologies for downscaling climate data to finer
scales, a framework for automating the downscaling process for a given area
of interest (AOI), irrespective of its spatial extent, is not available. In
view of this, the objective of this thesis is to develop a computational
framework for downscaling climate data for a given AOI irrespective of its
spatial extent. This chapter gives an overview of the need, objectives,
scope, research contribution and organization of the thesis.

1.2 Need of Research 

Due to the non-availability of a proper computational framework for
downscaling global climate models to local level for large areas, aggregate
climate prediction for large regions such as a country is an intricate task.
More specifically, in the absence of an automation framework, the analyst has
to carry out the downscaling of climate data manually for a large number of
sub-regions of the AOI. Hence, in this Ph.D. research work, a computational
framework for automating grid-wise downscaling for an AOI such as a country,
or a state of a country, has been proposed. The proposed computational
framework is implemented as a set of software modules that are collectively
named the Efficient Multi-site Statistical Downscaling Model (EMSDM). In the
subsequent discussion, the developed framework is referred to as EMSDM.

1.3 Objectives 

Following are the objectives of the research work:

1. Selection of a suitable downscaling technique.

2. Development of relevant data structures for the GCM and local data-sets as
required for downscaling.

3. Generation of a spatial database for grid-wise downscaling for a selected
study area.

4. Development of a mathematical model for statistical downscaling.

5. Development and implementation of a statistical downscaling model.

1.4 Scope of Research 

The present research work focuses on the development of a computational
framework for downscaling climate data at regional scale, viz. a country or a
state. Owing to the ease of its implementation in computer programs,
statistical downscaling based on regression techniques has been adopted.
Statistical downscaling is a type of downscaling that relates large-scale
climatic variables of a Global Climate Model (GCM) to local climate
variables. This relationship is developed using a statistical model, which
can then be used to generate future data for climate prediction [Wilby
et al., 2004]. The GCM CanESM2 and precipitation data acquired from the India
Meteorological Department (IMD) are used for the application of EMSDM.

1.5 Research Contribution 

Following are the two major contributions of the research work:

1. Design guidelines pertaining to the organization and management of
climate data, viz. precipitation and temperature, used for downscaling and
subsequent analysis. In brief, the following representation summarizes this
contribution:

Data : Organization => Management

2. Development of a computational framework for analysis and information
generation. This contribution provides the “value addition” to acquired
data for information dissemination to the analyst, i.e. how data can be
transformed into useful climate information for a given AOI under the
purview of the proposed framework. In brief, the following representation
summarizes this contribution:

Information => Data + Value Added


1.6 Organization of the Thesis 

Chapter 1 presents a general overview and the need, objectives, scope,
research contribution and organization of the research. Chapter 2 briefly
discusses the preliminaries (concepts) relevant to the present research, such
as Climate, Weather and Atmosphere. These preliminaries form the basis for
the discussions in subsequent chapters. Chapter 3 reviews downscaling
approaches and their types, with a major focus on statistical downscaling,
and reviews existing research works pertaining to statistical downscaling.
Chapter 4 discusses the mathematical background of the major concepts,
including correlation, multiple linear regression, multivariate regression
and model solution, needed to develop a mathematical model based on
regression techniques. Chapter 5 discusses the "Efficient Multi-site
Statistical Downscaling Model (EMSDM)" and its underlying steps in detail.
Chapter 6 discusses the implementation of EMSDM in detail, including the
pre-processing of the data-sets and the algorithms needed to implement the
steps discussed in Chapter 5. Chapter 7 presents the application of EMSDM to
India using the CanESM2 and IMD data-sets to demonstrate its applicability
for the specified area of interest. Finally, Chapter 8 discusses the major
conclusions drawn from the research, together with recommendations for
further study and implementation related to the current research.

Chapter 2


Preliminaries 

2.1 General 

Climate research draws on a number of underlying concepts. In this chapter,
the key terminologies pertaining to climate research are discussed in brief.
These preliminary terms form the basis for the discussion carried out in
subsequent chapters.

2.2 Terminologies 

2.2.1 Weather 

Weather is the atmospheric phenomenon caused by the transfer or movement of
energy. Transfer of energy in the atmosphere occurs via the movement of air,
which causes the major weather phenomena [Allaby, 2009; Pont, 2014]. The
association between time, place and atmosphere signifies the weather. In
other words, the condition of the atmosphere at a particular time and place
is known as the weather of that place. It can be defined in brief as:

"Weather is a Spatio-temporal phenomenon of Atmosphere." 

Weather is associated with the physical conditions in the atmosphere
(humidity, temperature, air pressure, wind, and precipitation) that exist
over short time scales, generally days or weeks. In general, major weather
patterns are the consequence of the rotation of the Earth and the non-uniform
heating of the atmosphere due to its exposure to solar radiation [Saha,
2008]. Weather patterns result in the development of high and low pressures
at a geo-location on the Earth. The major factors studied as components of
the weather of a particular location on the Earth are [Bychkov et al., 2010;
Yigit, 2015]:

1. Temperature 

2. Humidity 

3. Precipitation 

4. Cloudiness 

5. Visibility 

6. Wind 

7. Atmospheric Pressure 

8. Solar radiation 

2.2.2 Climate 

Climate signifies the general weather condition that exists over a longer
period of time. Therefore, climate is the average weather condition
quantified over a given set of decades or centuries. Climate can be defined
as:

"Climate is statistics of Spatio-temporal phenomenon of Atmosphere."

While weather signifies the short-term atmospheric condition in time and
space, climate signifies the long-term atmospheric condition. Mark Twain
succinctly differentiates climate and weather as:

"Climate is what one expects and weather is what one gets."

Moreover, climate describes the trends of the weather and helps in predicting
the weather conditions of the future. The climate of a particular
geo-location is shaped by factors like glaciers, mountains, ocean currents,
planetary motions and winds [Allaby, 2009].

Statistically, the climate of a place is defined as the weather statistics
that prevail over the long term. In short, climate is "statistics of
weather". Quantitative measurement of climate can be carried out by
determining the long-term statistics of various weather parameters like
temperature and rainfall, which are also referred to as climate elements. The
climate of a particular place is also affected by extreme weather events.
These extreme weather events are not treated as anomalies but are used to
obtain better projections of future climate change.
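
The idea of climate as "statistics of weather" can be illustrated with a
minimal C sketch. It is only an illustration under stated assumptions: the
type ClimateStats and the function climate_stats are hypothetical names
introduced here and are not part of EMSDM, and the input is assumed to be a
plain array of observations of one climate element. The minimum and maximum
are kept so that extreme events remain visible in the summary.

```c
#include <assert.h>
#include <stddef.h>

/* Summary statistics of one weather element over a long record
 * (hypothetical structure, for illustration only). */
typedef struct {
    double mean;      /* long-term average */
    double variance;  /* sample variance (n - 1 in the denominator) */
    double min;       /* lowest recorded value */
    double max;       /* highest recorded value, keeps extreme events */
} ClimateStats;

/* values[] holds n >= 2 observations of one climate element. */
ClimateStats climate_stats(const double *values, size_t n)
{
    ClimateStats s;
    double sum = 0.0, sq = 0.0;
    size_t i;

    assert(values != NULL && n >= 2);

    s.min = s.max = values[0];
    for (i = 0; i < n; i++) {
        sum += values[i];
        if (values[i] < s.min) s.min = values[i];
        if (values[i] > s.max) s.max = values[i];
    }
    s.mean = sum / (double)n;

    /* A second pass over the data keeps the variance numerically stable. */
    for (i = 0; i < n; i++) {
        double d = values[i] - s.mean;
        sq += d * d;
    }
    s.variance = sq / (double)(n - 1);
    return s;
}
```

In practice the array would hold several decades of daily or monthly values,
so that the resulting statistics describe climate rather than weather.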

A climate study can be carried out on different geographical scales. Due to
the diverse topography of the Earth’s surface, places tens of kilometers or
just a few kilometers across can have their own climate, which can be termed
the local climate of that place. For example, an urban area has its own
climate due to the prevalence of the urban heat island effect, which makes it
easily distinguishable from the surrounding area in terms of climate. In
brief, urban heating is a local climatic phenomenon. For a large geographical
extent like an individual state or country, the climate is termed regional
climate. Regional climate represents the weather statistics or pattern of a
particular region, i.e. a state or a country. Moreover, global climate
statistics can be derived from regional climate statistics.

Different geographic locations on the Earth are classified into different
climate zones. The climate classification given by the famous Russian-born
climatologist Wladimir Köppen remains the most commonly used system
recognized by geographical and climatological societies across the world.
Köppen classified climate zones as desert, polar, temperate, tropical, and
subtropical climates [Allaby, 2009]. Climate zones exhibit varying patterns
of precipitation and temperature. Holistically, climate researchers
investigate the global climate.

2.2.2.1 Key determining factors of the climate 

Table 2.1 enumerates the factors that determine the climate of an area or
region.

Table 2.1: Factors affecting the Climate

S.No.  Factor      Remark
1      Altitude    Higher altitude areas are colder than low altitude areas.
2      Latitude    Areas near the equator are warmer.
3      Humidity    Areas near water bodies like rivers or oceans have high
                   humidity in comparison to desert areas.
4      Topography  Mountains often block the circulation of air and hence
                   act as a barrier that affects the weather.


As depicted in Table 2.1, different factors affect the climate of a given
area. The most important factor is latitude. Owing to the ellipsoidal shape
of the Earth, with its flattening at the poles, locations close to the
equator are more exposed to solar radiation than locations nearer to the
North and South Poles. This temperature difference between locations drives
the atmosphere and transports heat away from the equator towards the poles.
This general circulation of the atmosphere is divided into different
circulation cells. These cells give rise to distinct pressure zones and wind
belts [Allaby, 2009].

Other factors that shape climate include land-sea distribution, mountains and
oceans. Ocean currents have a profound influence on regional and global
climate. The Gulf Stream in the Atlantic affects the weather of Northwest
Europe. The periodic El Niño current in the equatorial Pacific can have
negative effects on the weather of some areas of South America and
Australasia. Coastal regions are usually exposed to mild and humid maritime
climates, while the interior regions of the land masses have more continental
climates that experience extreme winters and hot summers. Mountains influence
regional as well as local climates and, moreover, disrupt the circulation of
winds [Rafferty, 2011; Shrestha et al., 2014].

2.2.3 Atmosphere 

Layers of gases surrounding the Earth are collectively termed the Earth’s
atmosphere. The atmosphere resembles one vast blanket of gases that surrounds
the Earth. These layers of gases are held together by the Earth’s
gravitational force. The atmosphere keeps the Earth warm with the help of
solar radiation, sustaining life on the Earth. Further, it protects life from
harmful solar radiation. The atmosphere is also a major component of the
water cycle on planet Earth. Most importantly, it provides the air that
living beings breathe. The major composition of the Earth’s atmosphere is
given in Table 2.2 [Alexander, 2012; Allaby, 2009].

Table 2.2: Earth's Atmosphere gaseous components percentage

Gas (Name)            Formula  Volume in %
Nitrogen              N2       78
Oxygen                O2       21
Argon                 Ar       0.9
Other Gases Combined  -        0.1

As observed from Table 2.2, Nitrogen and Oxygen make up 99% of the
atmosphere. Greenhouse Gases (GHGs) make up less than 0.05% of the
atmosphere; of the total gaseous composition, 0.04% is Carbon Dioxide (CO2).

Generally, the atmosphere is classified into five layers, each with its own
characteristics and properties. Following are the atmospheric layers [Bychkov
et al., 2010; Saha, 2008]:

1. Troposphere 

2. Stratosphere 

3. Mesosphere 

4. Thermosphere 

5. Exosphere 

Another classification of the atmosphere into seven layers is given below
[Bychkov et al., 2010]:

1. Troposphere 

2. Stratosphere 

3. Mesosphere 

4. Chemosphere 

5. Thermosphere 

6. Ionosphere 

7. Exosphere 

Troposphere and Stratosphere are the lowest layers in both classifications.
The Troposphere and the lower part of the Stratosphere, where the ozone
resides, are the parts of the atmosphere responsible for the weather; the
remaining parts of the atmosphere do not affect the weather. The Troposphere
is the densest layer of the Earth’s atmosphere and comprises 80% of its mass.
The occurrence of major weather-related activities is attributed to the
Troposphere.

2.2.4 Climate Model 

A climate model is a mathematical representation of the interaction between
matter and energy in the various regions of the atmosphere, land and ocean
that make up the climate system. These mathematical representations are based
on the fundamental laws of physics, fluid motion, and chemistry. When the
coverage of a climate model is global, it is known as a Global Climate Model
(GCM). The equations of such a model are solved on a 3D grid of the
atmosphere over a number of time steps with the help of high-performance
computing resources. Figure 2.1 illustrates the concept of a climate model.
As shown in Figure 2.1, each 3-dimensional grid cell is described by
mathematical equations that represent the materials in it and the way energy
moves through it.

Figure 2.1: Conceptualization of Climate Model

The resolution of a climate model is defined by its grid cell size. A climate
model with a higher level of detail has smaller grid cells and therefore more
of them, so more computing power is needed to run it. Figure 2.2 depicts the
typical spatial resolution used in climate models for four IPCC Assessment
Reports. In the First Assessment Report (FAR) of 1990, major climate models
used grid cells of about 500 km. In the Second Assessment Report (SAR),
published in 1996, the grid cell size was improved to about 250 km. In the
Third Assessment Report (TAR), published in 2001, the grid cell size was
reduced to about 180 km, while in the climate models of the Fourth Assessment
Report (AR4) it was reduced to about 110 km. This improvement in spatial
resolution allows such climate models to begin to make more reliable
projections of regional climate in the future.

Figure 2.2: Typical spatial resolution of climate models used in IPCC
Assessment Reports (IPCC AR4 WG 1, Chapter 1, page 113, Figure 1.4).
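
The link between grid cell size and computing effort can be made concrete
with a rough, back-of-the-envelope C sketch. It rests on loud assumptions: the
Earth's surface area is taken as approximately 5.1e8 square kilometres, cells
are treated as squares of equal area, and vertical levels and time steps are
ignored. The function names are hypothetical and serve only this
illustration.

```c
#include <assert.h>

/* Approximate surface area of the Earth in square kilometres
 * (assumed value for this illustration). */
#define EARTH_SURFACE_KM2 5.1e8

/* Rough number of horizontal grid cells needed to tile the globe with
 * square cells of the given edge length. Ignores vertical levels and the
 * convergence of meridians towards the poles. */
double horizontal_cells(double cell_km)
{
    assert(cell_km > 0.0);
    return EARTH_SURFACE_KM2 / (cell_km * cell_km);
}

/* Factor by which the horizontal cell count grows when the cell edge
 * shrinks from coarse_km to fine_km. */
double refinement_factor(double coarse_km, double fine_km)
{
    return horizontal_cells(fine_km) / horizontal_cells(coarse_km);
}
```

Under these assumptions, cells of 500 km give about 2,040 horizontal cells,
whereas cells of 110 km give roughly 42,000, an increase by a factor of about
20. Halving the cell edge always quadruples the horizontal cell count, which
is one reason why finer global grids demand far more computing power.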


However, because the coverage area is global and computing resources are
limited in practice, the spatial resolution of GCMs remains of the order of
hundreds of kilometers. To refine the coarse-resolution output of global
climate models into finer-resolution climate information, the downscaling
process is used, so that the refined information can be applied to regional
and local topography with better reliability.

A climate model is very similar to a weather forecast model: a weather
forecast model is used to predict short-term atmospheric variability and
change, whereas a climate model is used to predict long-term climate
variability and change. Weather and realistic storms can be naturally
simulated by climate models.

General Circulation Models, also called Global Climate Models, are
abbreviated as GCMs. A Global Climate Model is a mathematical formulation of
the transfer of materials and energy through the climate system. Global
Climate Models are based on well-documented physical processes. Inputs to
Global Climate Models include vegetation and soil wetness, glacial and sea
ice, ocean circulation, and human emissions of greenhouse gases and other
pollutants, as well as wind direction, wind speed, air temperature, pressure
and humidity. The outputs of a global climate model are known as climate
projections; these are essentially statistics and are used to investigate
climate-related questions. Projections from a climate model are not expected
to be exactly the same as what really occurs, but they should have the same
typical characteristics.

2.2.5 Climate Change 

Climate change is a change observed in the statistical distribution of
weather patterns over a long period of time, viz. decades to millions of
years. Climate change signifies a change in average weather conditions, or in
the time variation of weather within the perspective of longer-term average
conditions. Climate change is attributed to factors like biotic processes,
non-uniform exposure of the Earth to solar radiation, plate tectonics and
volcanic eruptions. Certain types of human activities are also identified as
major causes of recent climate change, which is also referred to as global
warming [Chen, 2012]. However, there is no general agreement in the existing
literature on which specific term, Global Warming or Climate Change, should
be used to refer to anthropogenically forced changes in climate.

Investigators in the area of climate studies are continuously working to
improve the analysis of historical and future climate by using
climate-related observations and mathematical models. Comprehensive
historical climate records, based on evidence such as borehole temperature
profiles, analyses of sediment layers and stable isotopes, are continuously
under development. GCMs based on the earth sciences are frequently utilized
in theoretical approaches to reproduce the historical climate data-set, carry
out future projections, and relate causes and effects in climate change.

Factors that can affect climate are termed climate forcings or "forcing
mechanisms". These are classified as internal or external mechanisms.
Internal forcing mechanisms are natural processes that are components of the
climate system, for example the thermohaline circulation. External forcing
mechanisms can be either natural or anthropogenic. Anthropogenic mechanisms
are attributed to humans, such as increased emissions of greenhouse gases.
Natural mechanisms are attributed to natural phenomena like changes in solar
output, the Earth’s orbit, and volcanic eruptions.

Physical evidence of climate change comprises different types of parameters
like temperature and precipitation. Global records of surface temperature are
available since the late 19th century. For earlier periods, most of the
evidence is indirect: climatic changes are inferred from changes in proxy
indicators that reflect climate, such as glacial geology, ice cores,
vegetation and sea level change. Other physical evidence includes cloud cover
over the Arctic sea, melting of glacier ice and precipitation [Shrestha
et al., 2014].

2.2.6 Climate Projection


A climate projection is the mathematically simulated response of the climate
system to a scenario of future concentration or emission of Green-House Gases
(GHGs) and aerosols, derived using climate models. In common practice, a
projection can be considered as any portrayal of the future and the pathway
leading to it. However, the IPCC has attached a more explicit meaning to the
term "climate projection", namely model-driven estimates of future climate.
Climate projections are distinguished from climate predictions to emphasize
that climate projections depend on the concentration, emission and radiative
forcing scenario used. These scenarios are based on assumptions that may or
may not be realized in the future. Therefore, climate projections are
affected by uncertainties that are not related to the climate system itself.

2.2.7 Intergovernmental Panel on Climate Change (IPCC) 

The Intergovernmental Panel on Climate Change (IPCC) is the leading
international body for the assessment of climate change. The IPCC was
established in 1988 by the World Meteorological Organization (WMO) and the
United Nations Environment Programme (UNEP) and is located at the WMO
headquarters in Geneva. As an intergovernmental body, the IPCC is open to all
member countries of the United Nations (UN) and WMO. The IPCC currently has
195 member countries. Governments of member countries participate in the
review process in the IPCC’s plenary Sessions. The IPCC Chair and Bureau
Members are elected and the work programme is framed during the plenary
Sessions. The Secretariat coordinates all IPCC work and also acts as a link
to assist communication between member Governments. The administration of the
IPCC is governed by WMO and UN rules and procedures, including codes of
conduct and ethical principles (as outlined in UN Ethics, WMO Ethics
Function, Staff Regulations and 2012/07-Retaliation).

As outlined in UN General Assembly Resolution 43/53 of 6 December 1988, the
initial task of the IPCC was to prepare comprehensive reviews and
recommendations with respect to the current state of knowledge of the science
of climate change; the potential impact of climate change on social,
environmental and economic factors; and realistic response strategies and
elements for adapting to and mitigating the risks of climate change in
future.

The IPCC neither monitors climate-related data or parameters nor conducts any
research on climate change. It only assesses and reviews the latest
scientific, technical, social, economic and environmental information
compiled worldwide to understand climate change. IPCC reports are drafted,
reviewed, accepted, adopted and approved in its plenary Sessions. To ensure
the completeness of the objectives and of the assessment of current climate
change information, the IPCC adopts review as a core part of its report
generation and processing. To reflect a wide range of expertise and views,
the IPCC accepts contributions from thousands of scientists from all over the
world.

Owing to the scientific and intergovernmental nature of the IPCC, decision
makers and policy makers in all geographical regions benefit from the
balanced scientific information in its reports. Governments and other
authorities form policies on the basis of the scientific content and
recommendations of IPCC reports. IPCC reports are neutral with respect to
policy, although they deal objectively with economic, environmental,
scientific, social and technical factors relevant to the application of
particular policies.

The IPCC First Assessment Report (AR1) of 1990 produced scientific evidence
emphasizing the importance of climate change as a challenge that requires
world governments to tackle its effects. As an outcome of the IPCC First
Assessment Report, the United Nations Framework Convention on Climate Change
(UNFCCC) was created. The UNFCCC is the key international treaty to cope with
the consequences of climate change and to reduce global warming. In response
to the UNFCCC, Special Reports, Methodology Reports and Technical Papers are
compiled by the IPCC. The Kyoto Protocol of 1997 was a response to the IPCC
Second Assessment Report (AR2) of 1995. The IPCC Third Assessment Report
(AR3) was published in 2001 and the IPCC Fourth Assessment Report (AR4) was
published in 2007. IPCC AR4 focuses significantly on the integration of
climate change with sustainable development policies and on the relationships
between mitigation and adaptation. For these efforts, the IPCC was awarded
the Nobel Peace Prize at the end of 2007. The IPCC Fifth Assessment Report
(AR5) was published in 2014. The IPCC Sixth Assessment Report (AR6) will be
published in 2022; the first order draft is completed and the second order
draft is in review.

The geographical distribution of the scientific community participating in
the work of the IPCC has grown greatly. Authors and contributors from all
geographical locations are involved in compiling and reviewing the reports.
The geographical distribution of the topics covered by the reports has also
grown significantly.

2.2.8 Downscaling 

Downscaling is a method adopted to deduce high-resolution information from
low-resolution variables. The technique is based on dynamical or statistical
approaches and is commonly used in disciplines such as meteorology,
climatology and remote sensing. Downscaling generally refers to refining
spatial resolution, but it is sometimes also used for refining temporal
resolution [Lloyd and Winsberg, 2018; Wilby and Wigley, 1997].

After obtaining the regression equation, the error and correlation between
the target and the downscaled time series are computed. These values show
the analyst how well the regression equation represents the historical
record. Finally, the equation and a similar set of predictors obtained from
the GCM are used to obtain historical conditions and future projections
downscaled to the desired AOI at the specified spatial resolution (for
example, 0.5°). Before downscaling is carried out, it is imperative to
select a set of GCM predictors that adequately captures the mathematical
relation between the large-scale atmosphere and the local-scale climate
parameters of the specified AOI. This selection becomes particularly
important where the topography of the AOI is very complex or where a few
dominant predictors affect the AOI, for example in mountainous or coastal
areas. Model performance can vary widely in such regions [McGuffie and
Henderson-Sellers, 2005].
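
As a minimal sketch of the steps above, assuming a single predictor and
ordinary least squares, the following Python fragment fits the regression
on a historical period, reports the error and correlation, and applies the
equation to future predictor values. All numbers and names
(predictor_hist, local_obs, predictor_future) are hypothetical and only
illustrate the workflow.

```python
from math import sqrt


def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rmse(pred, obs):
    """Root mean square error between two equally long series."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson_r(u, v):
    """Pearson correlation coefficient between two series."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    su = sqrt(sum((a - mean_u) ** 2 for a in u))
    sv = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (su * sv)


# Hypothetical historical data: coarse-grid predictor and local observation.
predictor_hist = [10.0, 12.0, 14.0, 16.0, 18.0]
local_obs = [11.1, 12.9, 15.2, 16.8, 19.0]

# Training: obtain the regression equation.
a, b = fit_linear(predictor_hist, local_obs)

# Evaluation: error and correlation on the historical record.
downscaled_hist = [a + b * x for x in predictor_hist]
error = rmse(downscaled_hist, local_obs)
corr = pearson_r(downscaled_hist, local_obs)

# Application: the same equation applied to future predictor values.
predictor_future = [19.0, 20.5]
downscaled_future = [a + b * x for x in predictor_future]
```

With several predictors, the same pattern applies with a multivariate
regression in place of fit_linear.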

Downscaling techniques are discussed in detail in Chapter 3 and Chapter 4.
Chapter 3 covers downscaling and its classification. Chapter 4 provides the
mathematical background of the statistical downscaling method.

Chapter 3

Downscaling 

3.1 What is Downscaling? 

Downscaling is the process of linking two sets of variables, one referring
to the large scale and the other to the small scale. In this research work,
the large-scale variables represent the circulation pattern over a large
spatial extent or region, and the small-scale variables can be any of the
climate variables such as precipitation, temperature and humidity. The
small-scale variables are station measurements collected continuously at a
given location during a specified time period.

Large-scale variables vary smoothly and slowly over their spatial domain,
whereas small-scale variables do not. Small-scale variables are measured at
stations as observations acquired through instruments such as rain gauges,
thermometers and barometers.

Using a downscaling approach, a physically meaningful mathematical
relationship between the set of large-scale variables and the set of
small-scale variables is developed and evaluated. This relationship
increases the reliability of climate projections and can be used for better
projections and predictions pertaining to climate change.

The spatial scale selected for downscaling affects the refinement of climate
variables. The spatial scale is large for variables that vary smoothly and
slowly. The spatial scale is small if climate-related changes occurring over
a small region are noticeable. A small-scale variable exhibits high
covariance with a noteworthy component of the space defined by the spatial
coverage of the large-scale variable, because the small-scale variable is
contained within this space [Wilby et al., 2004; Wilby and Dawson, 2013;
Wilby et al., 2002, 1998].

In other words, small scale refers to the local scale rather than to a
process that solely involves small spatial scales. Downscaling is only
possible when the large spatial scale is coupled with a local process in
some manner. In the absence of this coupling between variables, downscaling
produces non-interpretable outputs as climate projections. Hence, coupling
between the variables is mandatory for downscaling. When the relation
between the large-scale variables and the small-scale variables is
definitive, downscaling produces more reliable climate predictions and
better projections pertaining to climate change [Bhuvandas et al., 2014;
Fowler et al., 2007].

To identify matching and harmonized time behavior on large and small
scales, statistical downscaling approaches focus on the time dimension.
Temporal variation is represented mathematically as a function of time with
a given time structure. A similar time structure on diverse spatial scales
implies high temporal correlation. Downscaling can be carried out in both
spatial and temporal contexts. Spatial and temporal contextual properties
are, moreover, related to the set of variables on which downscaling is
performed [Bhuvandas et al., 2014; Ekstrom et al., 2015].

3.2 Why Downscaling?


Downscaling is used to make inferences about a small-scale or local-scale
climate variable at a given spatial location using a large-scale variable
from global climate models. Without downscaling, the global mean values of
large-scale variables derived from global climate models are of little
relevance for practical use. For example, using downscaling approaches,
investigators apply global mean values of a large-scale variable to predict
precipitation at a local level or over a smaller spatial extent. Although
general circulation models (GCMs) play an essential role in studying
climate change, they cannot give a practically useful prediction of the
local climate. Consequently, it is desirable to downscale the climate
projections of general circulation models either through a high-resolution
regional climate model (RCM) by dynamical downscaling or through local
station data by statistical downscaling. Owing to the following limitations
of GCMs, downscaling is essential for investigating climate change at a
local level:

1. Using the general circulation models, investigators are not able to carry 
out prediction of the real climate system at local scale, despite the fact 
that the general circulation models provide a reasonable prediction of 
the climate system on global scales. 

2. Considering the requirements of real-world problems, general circulation
models are not suitable for providing high-resolution datasets for
climate studies at a local scale. Currently, general circulation model
datasets have low resolution, and technical issues such as artificial
climate drift, atmosphere-ocean coupling, direct industrial effect and
cloud representation are not completely incorporated by them [Diaz-Nieto
and Wilby, 2005; Haarsma et al., 2016].

3. Mathematical uncertainties are associated with general circulation
models and downscaling methods. These uncertainties interfere with the
realistic prediction of climate change [Chen et al., 2011; Geerts and
Linacre, 1998; Hargreaves, 2010; Stouffer et al., 2017; Wilby et al.,
1998].

4. Since general circulation models have a coarse spatial resolution, they
cannot represent features with a spatial extent smaller than the GCM grid
box size. Moreover, general circulation models cannot account for
significant variations in climate statistics inside a small region, such
as the precipitation within a small city or town. Through downscaling,
climate statistics with comparatively better reliability can be derived
for smaller spatial extents [Smerdon, 2012; Wilby et al., 2004; Wilby and
Dawson, 2013].

3.3 Classification of Downscaling Methods 

Downscaling is a broad field of study in itself. Because a large number of
downscaling methods are available, they need to be classified, and there
are several ways to do so. Generally, downscaling methods are classified on
the basis of either the mathematical concepts applied or their usage. The
methods under the first classification are listed below:

1. Simple Change Factors 

2. Synthetic Statistical 

3. Advanced (Deterministic Statistical) 

4. Dynamical

The second classification is based on the usage of the downscaling
methods, as listed below:

1. Simple Downscaling Methods 

(a) Analogues 

i. Spatial Analogue 

ii. Temporal Analogue 

(b) Change Factors ( Delta Method ) 

(c) Bias Correction 

i. Local Scaling 

ii. Quantile-Quantile (QQ) Correction 

(d) Asynchronous Regional Regression Model (ARRM) 

(e) Bias Correction Constructed Analogues (BCCA) 

(f) Bias Correction Spatial Downscaling (BCSD) 

(g) Multivariate Adaptive Constructed Analogs (MACA) 

2. Advanced Downscaling Methods 

(a) Statistical Downscaling 

(b) Dynamical Downscaling 

3.4 Simple Downscaling Methods 

Simple downscaling methods are the least complex and the least expensive
methods of downscaling [Maraun and Widmann, 2018]. They have mainly been
developed for generating higher-resolution information from GCMs for
developing impact models. These techniques employ comparatively simple
transformations of the coarser outputs of GCMs, mainly temperature and
precipitation [Maraun and Widmann, 2018]. The following subsections discuss
the major types of simple downscaling methods.

3.4.1 Analogue Downscaling Methods


Analogue downscaling methods (ADM) generate an approximate representation
of the weather pattern for a specified day using one or more historical
weather patterns. The foremost implementation of ADM was carried out by
Zorita and von Storch [1999] for downscaling GCM data for Madrid, Spain.
Analogue downscaling methods are applied when climate models are not
available. The investigator selects observations from some spatial location
or some time period to reflect the climate change in the area of interest.
Analogue methods can be further classified as:

3.4.1.1 Spatial Analogue 

In spatial analogue methods, a region having climatic conditions similar to
those of the area of interest is selected for downscaling. The method is
simple. However, since it is difficult to find regions having similar
climatic conditions, its applicability is limited.

3.4.1.2 Temporal Analogue 

In temporal analogue methods, a time period having the desired climate is
selected for downscaling. The method is simple but inflexible, and it is
difficult to find a suitable time period.

3.4.2 Change factors (Delta Method) 

The change factor (delta) method is one of the most commonly used
downscaling methods. It is most commonly used to investigate the effects of
climate change on water resources. In this method, changes in climate are
estimated by comparing the future climate and the current climate simulated
by GCM and RCM models. Subsequently, these changes are combined with
observed local datasets having higher resolution. In general, deviations in
temperature (maximum and minimum) are added to the observed temperature
values, and ratios of precipitation (future precipitation divided by
current precipitation) are multiplied with the current observations. In
this approach, model biases are intrinsically corrected, since the
scenarios are generated by altering observations. However, the corrections
apply only to the mean of the climate change. The change factor method can
be combined with more complex stochastic methods.
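
The arithmetic of the change factor method can be illustrated with a short
Python sketch. It assumes that the change factors are computed from the
period means of the simulated baseline and future series; the function name
delta_method and all values are hypothetical.

```python
def delta_method(obs_temp, obs_precip,
                 sim_temp_base, sim_temp_future,
                 sim_precip_base, sim_precip_future):
    """Apply additive temperature and multiplicative precipitation factors."""
    def mean(values):
        return sum(values) / len(values)

    # Temperature: difference between simulated future and baseline means.
    temp_delta = mean(sim_temp_future) - mean(sim_temp_base)
    # Precipitation: ratio of simulated future mean to baseline mean.
    precip_ratio = mean(sim_precip_future) / mean(sim_precip_base)

    scenario_temp = [t + temp_delta for t in obs_temp]
    scenario_precip = [p * precip_ratio for p in obs_precip]
    return scenario_temp, scenario_precip


# Hypothetical local observations and coarse model simulations.
obs_temp = [20.0, 22.0, 21.0]
obs_precip = [0.0, 5.0, 10.0]
scenario_temp, scenario_precip = delta_method(
    obs_temp, obs_precip,
    sim_temp_base=[18.0, 19.0, 20.0], sim_temp_future=[20.0, 21.0, 22.0],
    sim_precip_base=[4.0, 6.0], sim_precip_future=[5.0, 7.0],
)
```

Because the scenario is built by altering the observations, its day-to-day
variability is inherited from the observed record; only the mean changes.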

3.4.3 Bias Correction 

General circulation models have biases, which limit their direct use in
impact studies. These biases can be magnified as they propagate through the
impact models, finally resulting in simulation biases [Franzke et al., 2015;
Haarsma et al., 2016; Stouffer et al., 2017]. Bias-correction methods can
eliminate the systematic biases of general circulation models, but they
cannot remove the unsystematic biases. Wilby et al. [2004] show bias
correction of the mean through standardization. Wood et al. [2004] show
bias correction of the distribution through quantile-based mapping. Like
general circulation models, regional climate models also have biases
[Maraun and Widmann, 2018; Shrestha et al., 2014].

3.4.3.1 Bias Correction (local scaling) 

The bias correction with local scaling method aims to match the monthly
mean of the corrected values with that of the observed values [Lenderink et
al., 2007]. The method uses monthly correction values calculated from the
differences between observed and raw model data. Generally, precipitation
is corrected month by month using a multiplier, and temperature using an
additive correction. The corrections are computed from various point
observations and then applied to the model outputs, much like a
rubber-sheeting process, in order to obtain an appropriate match for the
output.
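
A minimal Python sketch of local scaling at a single point is given below.
It assumes that each value carries a calendar month label and that the
model monthly mean of precipitation is non-zero; the function names and
data are hypothetical.

```python
def monthly_means(values, months):
    """Mean of the values for each calendar month label."""
    totals, counts = {}, {}
    for value, month in zip(values, months):
        totals[month] = totals.get(month, 0.0) + value
        counts[month] = counts.get(month, 0) + 1
    return {month: totals[month] / counts[month] for month in totals}


def local_scaling(model, obs, months, multiplicative):
    """Correct model values so that their monthly means match the observed ones.

    multiplicative=True is the usual choice for precipitation,
    multiplicative=False (additive) for temperature.
    """
    model_mean = monthly_means(model, months)
    obs_mean = monthly_means(obs, months)
    if multiplicative:
        factor = {m: obs_mean[m] / model_mean[m] for m in model_mean}
        return [v * factor[m] for v, m in zip(model, months)]
    offset = {m: obs_mean[m] - model_mean[m] for m in model_mean}
    return [v + offset[m] for v, m in zip(model, months)]


months = [1, 1, 2, 2]  # hypothetical: two values in January, two in February
corrected_precip = local_scaling(
    [2.0, 4.0, 1.0, 3.0], [3.0, 6.0, 1.0, 1.0], months, multiplicative=True)
corrected_temp = local_scaling(
    [10.0, 12.0, 15.0, 17.0], [11.0, 13.0, 14.0, 16.0], months,
    multiplicative=False)
```

In an application the monthly factors would be estimated on a calibration
period and then applied to the projection period, rather than to the same
series as in this toy example.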

3.4.3.2 Bias Correction ( Quantile-Quantile (Q-Q) Correction) 

The quantile-quantile correction method is similar to the local scaling
method. However, in this method the entire distribution is corrected
[Mearns et al., 2018]. The method was formulated specifically to assist
climate change studies. It corrects the distributions of the variables,
thereby possibly reducing the bias of extreme-event estimators in climate
model simulations [Cannon et al., 2015; Jeon et al., 2015; Quintana Segui
et al., 2010; Zhang et al., 2017]. The basic concept of quantile-quantile
correction is discussed below [Scheuerer, 2018].

Let $l$ be a location associated with a given analysis grid point and $x$ be
a location associated with a given forecast grid point in the proximity of
$l$. The basic objective of quantile mapping is to determine, for each
predicted value $f_x$, which quantile $q_{f,x}(k)$, $k \in [0,1]$, of the
predicted local dataset it corresponds to, and then to map it to the
corresponding quantile $q_{o,l}(k)$ of the observation dataset. The quantile
functions $q_{f,x}$ and $q_{o,l}$ are approximated from the training sample;
specifically, the investigator calculates the sample quantiles
$q_{f,x}(n/100)$ and $q_{o,l}(n/100)$ for $n \in \{1, 2, \ldots, 99\}$ and
performs linear interpolation between these distinct values.
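
The quantile mapping described above can be sketched in Python as follows.
The sketch assumes one forecast series and one observation series as the
training sample, estimates sample quantiles by linear interpolation between
order statistics, and clamps values that fall outside the 1st to 99th
percentile range; these choices and the function names are assumptions made
for illustration only.

```python
def sample_quantiles(sample):
    """Sample quantiles at levels n/100 for n = 1, ..., 99."""
    s = sorted(sample)
    quantiles = []
    for n in range(1, 100):
        pos = (n / 100) * (len(s) - 1)   # fractional index into sorted data
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        quantiles.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return quantiles


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            weight = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + weight * (ys[i] - ys[i - 1])


def quantile_map(value, forecast_train, obs_train):
    """Map a forecast value to the observed value at the same quantile level."""
    q_forecast = sample_quantiles(forecast_train)
    q_observed = sample_quantiles(obs_train)
    return interpolate(value, q_forecast, q_observed)


# Hypothetical training sample: observations are twice the forecast values.
forecast_train = [float(v) for v in range(101)]
obs_train = [2.0 * v for v in forecast_train]
corrected = quantile_map(50.0, forecast_train, obs_train)
```

Clamping keeps the sketch short; an application that targets extreme events
would need an explicit treatment of values beyond the training range.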

3.4.4 Bias Corrected Constructed Analogue (BCCA) 

The Bias Corrected Constructed Analogue (BCCA) method uses historical
climate observations to apply the correction. In BCCA, bias correction is
first applied to the GCM outputs. Subsequently, a set of 30 similar
observed days is selected by comparing the bias-corrected GCM future
projection with the historical observations. Similarity is judged on the
basis of the spatial pattern and intensity of parameters such as
precipitation or temperature. A linear combination of the 30 selected days
is developed to produce an analogue that closely matches the GCM
projection. The coefficients computed for the 30 historical days are then
applied to local observations of the same temporal resolution to generate
a downscaled analogue. One major concern with this method is that, owing
to specific characteristics of GCM projections, analogues may not exist
for downscaling [Maurer et al., 2010; Walton et al., 2015].

3.4.5 Bias Correction Spatial Downscaling (BCSD) 

Bias-corrected spatial disaggregation (BCSD) links the quantiles of the GCM
outputs to historical local dataset patterns in order to generate daily
time series and finally construct the downscaled grid. In more detail, BCSD
comprises trend removal, bias correction by linking the quantiles of the
GCM outputs to historical local dataset patterns, and spatial resolution of
the local variables by interpolating the bias-corrected anomalies and
imposing climatological means at the finer scale [Hidalgo et al., 2008;
Maurer et al., 2010; Mearns et al., 2018].

3.4.6 Asynchronous Regional Regression Model (ARRM) 

The Asynchronous Regional Regression Model (ARRM) is a comparatively
complex statistical method that applies quantile regression to develop
relationships between two quantities that are approximately normally
distributed. These quantities do not necessarily have temporal
correspondence, but should have similar statistical properties such as mean
and variance. More specifically, ARRM uses piecewise regression to develop
the relationship between observed and modelled quantiles and then
downscales future projections [McGinnis et al., 2014; Stoner et al., 2013].
ARRM first sorts the global and local values and maps the resulting
quantiles against one another. Subsequently, it finds six breakpoints
between segments, using linear regression over a moving window of fixed
width to locate points where the slope of the Q-Q map changes abruptly.
Finally, it constructs a linear statistical model with the help of these
breakpoints.
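
The core idea of asynchronous regression, namely regressing rank-ordered
observations on rank-ordered model values instead of time-matched pairs,
can be sketched in Python as below. This is a deliberate simplification: it
fits a single linear segment and omits the breakpoint search and piecewise
segments of ARRM. The function name and data are hypothetical.

```python
def fit_asynchronous_regression(model_hist, obs_hist):
    """Fit obs = intercept + slope * model on sorted (not time-matched) values."""
    if len(model_hist) != len(obs_hist):
        raise ValueError("this sketch assumes equally long samples")
    xs = sorted(model_hist)   # rank-ordered model values
    ys = sorted(obs_hist)     # rank-ordered observations
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical historical samples; their time order is irrelevant here.
model_hist = [3.0, 1.0, 2.0, 4.0]
obs_hist = [4.0, 8.0, 2.0, 6.0]
intercept, slope = fit_asynchronous_regression(model_hist, obs_hist)

# The fitted relation is then applied to future model values.
model_future = [5.0, 2.5]
downscaled_future = [intercept + slope * v for v in model_future]
```

A piecewise version would fit one such line per segment of the Q-Q map,
with the segments joined at the detected breakpoints.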

3.4.7 Multivariate Adaptive Constructed Analogs (MACA) 

The Multivariate Adaptive Constructed Analogs (MACA) method is commonly
used in wildfire applications [Abatzoglou and Brown, 2012]. Downscaling is
performed by multivariate adaptive constructed analogs based on an
observation-based training dataset. A multi-process approach corrects the
biases in general circulation model output and performs the spatial
downscaling. The latter uses historical analogs from observational data
that represent various characteristics of the regional climate [Abatzoglou
and Brown, 2012].

3.5 Advanced Downscaling Methods 

Researchers have classified advanced downscaling into two major types,
viz. statistical downscaling and dynamical downscaling. Statistical
downscaling (SD) considers the fact that regional climate is influenced by
two key factors, viz. (i) the large-scale climatic state, and (ii)
regional/local physiographic features (e.g. topography, land-sea
distribution and land use). From this point of view, regional or local
climate information is extracted by first developing a statistical model
that relates large-scale climate variables (predictors), such as surface
air temperature and precipitation, to regional and local variables
(predictands). Subsequently, the large-scale output of a GCM simulation is
fed into the developed statistical model to estimate the corresponding
local and regional climate characteristics such as rainfall and
temperature. The statistical downscaling method and related research works
are discussed in detail in Section 3.10.

Dynamical downscaling approaches use regional climate models (RCMs), which
generate finer-resolution output on the basis of relationships of
atmospheric parameters for an AOI while taking GCM fields as boundary
conditions. The physical consistency between RCMs and GCMs is controlled
by the agreement of their large-scale circulations [Franzke et al., 2015].
Generalized steps of downscaling are shown in Figure 3.1. As is evident
from Figure 3.1, dynamical and statistical downscaling processes are
sometimes integrated in order to obtain optimal downscaling results in
time and space [Hillel and Rosenzweig, 2011; Lloyd and Winsberg, 2018].



Figure 3.1: Generalized View for Downscaling Process 


3.6 Qualities of Downscaling 

Downscaling is a process with a specific computational cost and time
complexity. Given this processing cost, the process and the output it
produces must have certain qualities. Some of these qualities are given
below:

Accuracy : It must be capable of reproducing high resolution historical records 

Feasibility : Downscaling methods must not be computationally complex. 

Practical : Tractable to observations for immediate use in applied studies.

Products : It must be able to use multiple variables (predictors). 

Resolution : Output climate projection with high spatial resolution (<10km). 

Synchronicity : Downscaled weather fields adhere to the fundamental
physical laws.

3.7 Data Used In Downscaling 

As discussed earlier in this chapter, downscaling is the process of linking
two datasets with a mathematical relationship, one referring to large-scale
variables and the other to small-scale variables. In this research work,
the large-scale variables represent the circulation pattern over a large
spatial extent or region, and the small-scale variables represent climate
variables such as precipitation, temperature and humidity over a smaller
spatial extent.

Large-scale variables are extracted from GCMs as the outputs of these
climate models. They represent the climate statistics for an area with a
large geographical or spatial extent. Because they represent a large
spatial extent, large-scale variables fall under the category of grid
data.

Small-scale variables, which are measurements from climate stations,
represent the climate statistics for an area with a small geographical or
spatial extent. These small-scale observations are categorized as below,
with some characteristics associated with each category.

I. Gridded Data


/ Spatially complete. 

/ Relies on gridded observations. 

/ Biases in observations propagate to downscaled products.

II. Point Observations (Station Data) 

/ Currently integrated into climate applications.

/ Eliminates biases in downscaled products. 

/ Subject to data availability and consistency. 

3.8 Applied use of downscaled data 

The sophistication of downscaling methods varies according to the needs of
the end-user application. The output of these methods is available as
downscaled data, which can reliably characterize the climate change of the
given geographical extent. Some important characteristics of downscaled
data are given below.

I. Data needs 

/ Spatial resolution. 

/ Temporal resolution. 

/ Multi-variable. 

II. Vulnerability Assessment 

/ Extreme events. 

/ Inter-annual variability. 

/ Coincident sequencing.

III. Uncertainty Information


/ Time horizon. 

/ Models used. 

As downscaled data is the product of downscaling methods, the selection of
a downscaling method is a very important factor, because not all
downscaling methods are suitable for generating useful information as per
the requirements of climate change studies.

3.9 Background Research Works 

Most downscaling methods were developed on the basis of conceptual and
experimental developments in weather forecasting studies of the 1950s and
1960s. Klein et al. [1960, 1959, 1967] developed the perfect prognosis (PP)
method to estimate the probability and type of precipitation, maximum,
minimum and average temperatures, cloudiness and visibility at
meteorological stations from numerical weather predictions. At first, the
authors generated temporally synchronized statistical relationships between
the climate parameters of interest and observations of coarse-resolution
climate parameters similar to those of weather forecasting models.

Subsequently, numerical model output was applied to the statistical
relationships to estimate local weather at specified times. Finally,
forecasting skill was used to gauge the efficacy of the developed model. For
climatological applications, skill in forecasting measures the goodness of a
forecast over a specified historical baseline of past observations. The
forecasting approach may yield completely distinct skill measurements at
different places, or even at the same place in different seasons. For
example, spring weather can be driven by erratic local endemic conditions,
whereas winter cold spells may correlate with discernible polar winds. The
mean square error, the coefficient of correlation between the forecasts and
observations, and the nonsystematic bias in the forecast were some of the
skill scores used in climate research [Murphy et al., 1989].
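
As an illustration of how such skill scores can differ between seasons at
the same place, the following Python sketch computes the mean square error,
the correlation coefficient and the mean error for each season separately.
The mean error is used here as a simple stand-in for a bias score, which is
an assumption; the data and names are hypothetical.

```python
from math import sqrt


def mean_square_error(forecast, observed):
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)


def mean_error(forecast, observed):
    """Average of forecast minus observation (positive means over-forecast)."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)


def correlation(forecast, observed):
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    sd_f = sqrt(sum((f - mean_f) ** 2 for f in forecast))
    sd_o = sqrt(sum((o - mean_o) ** 2 for o in observed))
    return cov / (sd_f * sd_o)


def skill_by_season(forecast, observed, seasons):
    """Group forecast/observation pairs by season label and score each group."""
    groups = {}
    for f, o, s in zip(forecast, observed, seasons):
        groups.setdefault(s, ([], []))
        groups[s][0].append(f)
        groups[s][1].append(o)
    return {
        s: {"mse": mean_square_error(f, o),
            "bias": mean_error(f, o),
            "corr": correlation(f, o)}
        for s, (f, o) in groups.items()
    }


forecast = [1.0, 2.0, 3.0, 5.0, 7.0, 6.0]
observed = [1.0, 2.0, 4.0, 4.0, 6.0, 8.0]
seasons = ["winter", "winter", "winter", "spring", "spring", "spring"]
skill = skill_by_season(forecast, observed, seasons)
```

In this toy example the winter forecasts track the observations closely,
while the spring forecasts have a larger error and a weaker correlation.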

In continuation of the above preliminary developments, Model Output
Statistics (MOS), which is based on the statistical relationship between
local climate observations and the output of a numerical model for a
specified projection time, was developed and implemented [Blocker, 1982;
Glahn and Lowry, 1972; Klein and Hammons, 1975]. Thus, any biases in the
numerical forecasts are compensated for when applying the statistical
scaling relationships. However, recalibration is necessary for each
modification of an existing numerical forecast model or development of a
new one. The foremost research work to transform GCM-scale output using the
PP approach is attributed to the seminal work of Kim et al. [1984]. The
authors used the monthly average surface temperature and monthly total
precipitation at 49 meteorological stations in the State of Oregon to
develop the mathematical relationship between GCM and local observations
using empirical orthogonal functions (EOFs). The first EOF explained about
81% and 79% of the total variance of the precipitation and temperature
observations respectively. The importance of this seminal work lies in
showing that prediction of local climate impacts is feasible on the basis
of trends exhibited by the time series of monthly weather anomalies
observed at GCM grid points superimposed over the area of interest.

Subsequently, the work of Kim et al. [1984] was advanced by Wigley et al.
[1990]. In addition to area-average temperature and precipitation, the
authors used other predictor variables such as geopotential heights,
airflow gradients and mean sea level pressure to downscale the GCM data to
the local grid. On the basis of validation of their model using independent
data, the authors reported spatial-mean explained variances ranging from 39%
to 76% for precipitation and from 58% to 87% for temperature. The authors
asserted that most of the explained variance arises from averaging of the
predictand over the grid. They also demonstrated that site-specific changes
can differ noticeably from those at the equivalent GCM grid scale.

Karl et al. [1990] were the foremost researchers to use GCM outputs to
replicate the observed surface climate. They termed their approach
‘climatological projection by model statistics’ (CPMS). They used a
primitive version of the two-level atmospheric GCM developed by Oregon
State University for downscaling using the PP and MOS approaches, together
with local temperature and precipitation data acquired from five
meteorological stations in the USA. Twenty-two predictor variables from
this GCM were used to estimate daily cloud ceiling, precipitation and
temperatures. They concluded that the MOS approach is preferable to the PP
approach, since biases in the variances and means of the GCM outputs can be
removed by applying MOS. In continuation of these research efforts, Von
Storch et al. [1993] were the first to coin the term downscaling. They
employed the PP approach to investigate Iberian rainfall anomalies in
winter and sea level pressure field anomalies over the North Atlantic. The
authors reported that the downscaled precipitation differed significantly
from the GCM estimate of precipitation at the same locations.

At nearly the same time, preliminary RCMs were also developed by different
researchers [Giorgi, 1990; Giorgi and Bates, 1989]. Among the earliest
efforts towards the development of RCMs were nested regional modeling
experiments conducted in the western USA, where the intricate topography
and coastline exert a significant influence over rainfall and temperature
patterns. As reported by Giorgi et al. [1994], the Pennsylvania State
University/National Center for Atmospheric Research mesoscale model (MM4),
with a spatial resolution of 60 km, was tested using month-long wintertime
simulations. MM4 was able to incorporate the topography of the region more
realistically. In Europe, a similar model was used to carry out a doubled
CO2 concentration experiment [Giorgi et al., 1992]. Formative climate
change investigations with RCMs were, however, hindered in executing long
simulations by limited computational facilities and the non-availability
of historical GCM outputs at six-hourly frequencies. Nevertheless, some of
the mathematical principles governing the selection of domain size and
resolution, as well as the consideration of boundary forcing, were soon
developed [Giorgi and Mearns, 1991].

During the last two decades, a number of research works have been reported
in the area of downscaling of GCM outputs. Comprehensive reviews of
downscaling methods have also been carried out by different researchers
[Fowler et al., 2007; Giorgi and Mearns, 1991; Hanssen-Bauer et al., 2005;
Hewitson and Crane, 1996; Maraun et al., 2010; Von Storch et al., 1997;
Wilby and Wigley, 1997; Xu, 1999].

Wilby and Wigley [1997] discussed the different downscaling approaches in
detail. The authors investigated the then-latest downscaling approaches
under four main classes, namely limited-area climate models, regression
methods, stochastic weather generators and weather pattern-based
approaches. On the basis of a comparative analysis of these approaches,
the authors reported that, owing to the feasibility of their
implementation and lower computing power requirement, regression methods
are the preferred methods of downscaling. More recently, Maraun et al.
[2010] carried out a detailed review of the downscaling of precipitation
data for climate change research. Barsugli et al. [2009] and Olsen and
Gilroy [2012] asserted that downscaling techniques can generate
fine-resolution data at a local scale; however, they will not correct
large-scale errors in GCMs. Hence, advanced GCMs are in continuous
development [Garcia et al., 2014]. However, in spite of the substantial
number of research works in the area of downscaling, no agreement has
emerged on the selection of an appropriate method for a particular
downscaling/bias-correction application, which results in uncertainty in
the associated climate change impact investigations. As an example, Chen
et al. [2011] reported substantial uncertainty pertaining to the selection
of downscaling methods. To address this problem, Puma [2012] developed
general guidelines for selecting a particular downscaling approach for
climate scenario development on the basis of the complexity of analysis and
spatiotemporal resolution (Figure 3.2).

Figure 3.2: Selection of Downscaling Method Based on Spatio-Temporal
Resolution [Puma, 2012]


3.10 Statistical Downscaling 

3.10.1 Detailed Description 

Statistical downscaling techniques usually refer to methods that
statistically establish the relationship between larger-scale atmospheric
attributes from GCMs (predictors), such as greenhouse gas concentration,
and a local-scale climate variable (predictand, generally a point or grid
estimate), for example monthly temperature or precipitation. Moreover,
because even Regional Climate Models (RCMs) do not have a resolution fine
enough to be used directly for impact studies and are themselves biased,
statistical downscaling is required for RCMs as well. Other major reasons
for adopting statistical downscaling for RCM datasets are that it is
computationally non-intensive and that it facilitates the generation of
unbiased downscaled data.

There are various types of statistical downscaling that utilize different
techniques, such as artificial neural networks, weather generators and weather
classification (typing) [Fowler et al., 2007; Giorgi et al., 1994]. Even
though statistical downscaling methods vary both in their intricacies and in
their principles, they have two basic processing steps in common, viz. a
training step and an application step. The training step uses two datasets:
real-world local observations, such as precipitation or temperature, and the
outputs of a physical model such as an RCM, a GCM or a reanalysis. Both
datasets cover the same past time period. Typically, the local observations
are of higher spatial resolution than the GCM outputs. During the training
step, a mathematical relationship between these two datasets is developed
through techniques like regression, forming a linkage between the local
observations and the GCM outputs. These techniques represent linear or
non-linear relationships between the local-scale variable and the large-scale
predictors. For example, suppose 0.5° precipitation data are to be obtained
over a given area of interest (AOI) using simple linear regression. First,
sufficient historical observations from a weather station or a gridded
observation product for the area are acquired. Subsequently, relevant
predictors that are synchronized in time with the historical observations of
the station are acquired from historical reanalysis or GCM data. The
regression equation is then developed. The general expression of this
equation is given in equation 3.1.

$$Y = c + \sum_{i=1}^{n} a_i X_i \qquad (3.1)$$

where: $Y$ = Historical Observations
       $c$ = Bias or Constant
       $n$ = Number of Parameters
       $a_i$ = $i$-th Coefficient
       $X_i$ = $i$-th GCM Parameter

After obtaining the regression equation, the error and the correlation between
the downscaled time series and the target are computed. The error and
correlation values give the analyst an overview of how well the regression
equation represents the historical record. Finally, the equation and the same
set of predictors obtained from the GCM are used to obtain the historical
conditions and future projections downscaled to the desired AOI at the desired
spatial resolution (for example 0.5°). For downscaling, it is imperative to
carefully select a set of predictors from the GCM that adequately represents
the relationship between the large-scale atmosphere and the local-scale
climate variables of the specified AOI. This selection becomes particularly
important where the topographical conditions of the AOI are very complex or
where some dominating predictor sets affect the AOI, for example for an AOI
located in a mountainous or coastal area. There can be large variations in
model performance in these types of topographical regions
[McGuffie and Henderson-Sellers, 2005].
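
The training step described above can be illustrated with a minimal sketch.
It assumes an ordinary least-squares fit of equation 3.1 and uses synthetic
data in place of real observations and predictors; the function names
(fit_downscaling_regression, evaluate_fit) and all numerical values are
hypothetical and are not taken from any particular downscaling tool.

```python
import numpy as np

def fit_downscaling_regression(X, y):
    """Least-squares fit of y = c + sum_i a_i * X[:, i] (equation 3.1)."""
    # Prepend a column of ones so that the constant c is estimated too.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[0], coeffs[1:]

def evaluate_fit(X, y, c, a):
    """Return RMSE and Pearson correlation between fitted and observed series."""
    y_hat = c + X @ a
    rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
    corr = float(np.corrcoef(y, y_hat)[0, 1])
    return rmse, corr

# Synthetic stand-ins: 120 time steps, 3 large-scale predictors (assumption).
rng = np.random.default_rng(0)
X_hist = rng.normal(size=(120, 3))
true_a = np.array([1.5, -0.7, 0.3])
y_obs = 2.0 + X_hist @ true_a + rng.normal(scale=0.1, size=120)

c, a = fit_downscaling_regression(X_hist, y_obs)
rmse, corr = evaluate_fit(X_hist, y_obs, c, a)
print(c, a, rmse, corr)
```

A low error and a high correlation on the historical period only indicate
that the equation reproduces the training record; they do not by themselves
guarantee skill for a future period.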

In the application step, the derived mathematical relationship is applied
to a set of GCM outputs pertaining to a different time period. The output of
the application step is a dataset of surrogate observations, or downscaled
results. In relation to climate change, the GCM outputs correspond to a
specified future state for which different forcing parameters (e.g.,
greenhouse gases) have been altered. The underlying concept behind statistical
downscaling for climate change investigation is to recalibrate the raw GCM
outputs for the future climate state, imparting attributes of the observations
during the historical (training) period, and in the process generating
information at a higher spatial resolution as a substitute for the coarser
information available at GCM level. Climate projections that have been refined
by statistical downscaling techniques are available as local values at a fine
resolution, which are generally considered to be more suitable for use in many
climate impact studies than the raw outputs of the GCMs.
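
As a minimal sketch of the application step, the fragment below applies an
already fitted linear relationship of the form of equation 3.1 to a new set
of predictor values. The constant, the coefficients and the future predictor
values are hypothetical placeholders, and apply_downscaling is an
illustrative name rather than a function of any existing software.

```python
import numpy as np

def apply_downscaling(c, a, X_future):
    """Apply y = c + sum_i a_i * X_i to predictors from another time period."""
    return c + np.asarray(X_future) @ np.asarray(a)

# Hypothetical constant and coefficients from a previous training step.
c = 2.0
a = np.array([1.5, -0.7])

# Hypothetical GCM predictors for three future time steps (rows).
X_future = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [2.0, 2.0]])

downscaled = apply_downscaling(c, a, X_future)
print(downscaled)
```

The sketch makes the stationarity assumption explicit: the same c and a that
were estimated on the historical period are reused unchanged for the future
period.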

Maraun et al. [2010] further classified the statistical downscaling methods
into three types: perfect prognosis (PP), model output statistics (MOS) and
weather generators. In PP, the statistical downscaling relationships are
developed using local observations. In MOS, gridded RCM simulations and
observations are utilized for developing the downscaling model. Weather
generators are hybrid downscaling methods that use the PP approach, the MOS
approach, or both.

With respect to the type of statistical method, downscaling can be classified
as (i) categorical, (ii) continuous-valued and (iii) hybrid [Fowler et al.,
2007; Wilby and Wigley, 1997]. In categorical downscaling, classification and
clustering approaches are used to develop a statistical model between
predictors and predictands [Zorita and von Storch, 1999]. In continuous-valued
downscaling, models are developed using regression analysis in order to map
large-scale predictors to local-scale predictands [Chandler and Wheater,
2002]. Mehrotra and Sharma [2010] carried out a non-parametric stepwise
predictor identification analysis. In hybrid downscaling, various statistical
approaches, referred to as weather generators, are integrated for downscaling
[Wilby et al., 2002].

The Statistical DownScaling Model (SDSM) is widely used for statistical
downscaling. It was developed by Wilby et al. [2002]. As per Wilby and Dawson
[2013], SDSM is a highly cited software tool in climate research. Moreover,
the Long Ashton Research Station Weather Generator (LARS-WG) is available to
downscale weather data for a single site for the current and future climate
[Semenov and Brooks, 1999]. LARS-WG is proprietary software.

The regional climate depends on two factors: the large-scale climatic state
and the regional/local physiographic features (land use, land-sea
distribution, topography etc.). Statistical downscaling makes use of this fact
[von Storch and Navarra, 1995, 1999; von Storch and Zwiers, 1999; Zorita and
von Storch, 1999]. From this point of view, small-scale or local climate
statistics are derived by developing statistical downscaling models that
relate large-scale climate variables, like precipitation and surface air
temperature, to small-scale or local climate variables. Subsequently, the
large-scale output of a GCM simulation is fed into the developed statistical
model to estimate the corresponding small-scale and local climate
characteristics (climate statistics), such as rainfall and surface air
temperature.

The key advantages and disadvantages of statistical downscaling models are
described next.

I. Advantages

- Not computationally intensive.
- Applicable to GCM and RCM output.
- Provide station/point values.

II. Disadvantages

- Lack of long/reliable observed series.
- Affected by biases in the RCM and GCM.
- Not physically based, e.g. climate feedbacks.
- Under-estimate variability and extremes.
- Assume stationary relationships in time.

3.10.2 Existing Research Works


During the last few decades, different types of statistical downscaling
approaches (methods) have been developed with a broad range of applications
pertaining to climate change studies [Fowler et al., 2007; Giorgi et al.,
2001; Hessami et al., 2008; Khalili et al., 2013; Lafon et al., 2013; Lee and
Jeong, 2014; Lin et al., 2017; Orlowsky et al., 2010; Wilby et al., 2002].
Among the different statistical downscaling methods, the regression approach
is the most widely used technique owing to the ease of its implementation and
its relatively low computational requirements in comparison to other methods
[Tang et al., 2016a]. In continuation of the aforementioned discussion,
selected research works pertaining to statistical downscaling methods are
discussed in the following paragraphs.

Khalili and Nguyen [2018] proposed a new statistical approach to the
downscaling of daily maximum and minimum temperature series for many different
sites concurrently. Their approach integrates a regression model of the
relation between local daily temperature extremes and the global climate
predictors with a model of the stochastic component based on the
singular-value decomposition (SVD) technique. The authors used data from ten
weather stations located in the southeast region of Ontario and the southwest
region of Quebec in Canada and two GCM datasets, namely HadCM3 and CGCM3. They
concluded that the proposed SVD-based statistical downscaling procedure is an
effective tool for downscaling temperature extremes for many sites
concurrently.

Serur and Sarma [2017] analysed the temperature and precipitation
characteristics of the Weyib River basin in Ethiopia in order to investigate
the effect of climate change. They used the CanESM2 model for the RCP2.6,
RCP4.5 and RCP8.5 scenarios. They developed the statistical downscaling model
using the observed daily data of 12 meteorological stations.

Chen et al. [2018] developed a software tool named Generator for Point Climate
Change (GPCC) to generate daily time series of climate change scenarios for
local and AOI-related climate change studies from the monthly climate
projections of RCMs and GCMs for a particular grid. GPCC downscales monthly
projections of GCMs or RCMs in a grid box to daily weather series at a point
scale or station.

Jaiswal et al. [2018] applied the statistical downscaling technique for
downscaling and forecasting the minimum temperature of Raipur City. They
developed a statistical downscaling model for predicting the temperature for
three future periods using Canadian Global Climate Model (CGCM) predictors for
the A1B and A2 climate forcing conditions. The authors used the Statistical
DownScaling Model (SDSM) software with a k-fold validation technique to
forecast the temperature for three different time frames, viz. FP-1
(2020-2035), FP-2 (2046-2064) and FP-3 (2081-2100). They reported that the
CGCM model parameters specific humidity at 850 hPa (nceps850gl), 500 hPa
geopotential height (ncepp500gl) and surface airflow strength (ncep_fgl) were
the most appropriate parameters for predicting future scenarios. Moreover,
they reported that a comparison of the mean monthly minimum temperature of the
generated scenarios with the base period showed a 1.1-11.2% increase in
minimum temperature for the A1B climate forcing condition.

Yhang et al. [2017] implemented an integrated approach comprising dynamic and
statistical downscaling. They tested their approach on East Asian summer
monsoon precipitation data in order to obtain finely resolved data. The
dataset acquired by the researchers covers a period of more than 57 years for
monsoon Asia, the Middle East and northern Eurasia. It is available on
0.5° x 0.5° and 0.25° x 0.25° grids. They concluded that the integrated
downscaling approach produced better results in time and space than the
individual approaches. Smid and Costa [2017] carried out a detailed review of
downscaling approaches with application to impact studies in urban areas. They
concluded that regression methods are the most suitable methods for climate
change studies. AM et al. [2017] developed an R package named spdownscale for
statistical downscaling of climate data using the quantile-quantile bias
correction technique.
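
To illustrate the general idea behind quantile-quantile bias correction, the
sketch below implements a simple empirical quantile mapping. It is an
assumption-laden illustration only: it does not reproduce the spdownscale
package or its interface, the function name quantile_map is hypothetical, and
the input values are made up.

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_new):
    """Empirical quantile mapping of model values onto the observed distribution.

    Each new model value is assigned its non-exceedance probability in the
    historical model distribution, and the observed value at the same
    probability is returned. Values outside the historical model range are
    clamped to the lowest or highest observed value.
    """
    model_sorted = np.sort(np.asarray(model_hist, dtype=float))
    obs_sorted = np.sort(np.asarray(obs_hist, dtype=float))
    probs_model = np.linspace(0.0, 1.0, len(model_sorted))
    probs_obs = np.linspace(0.0, 1.0, len(obs_sorted))
    p = np.interp(model_new, model_sorted, probs_model)
    return np.interp(p, probs_obs, obs_sorted)

# Hypothetical historical model output, observations, and new model values.
model_hist = [0.0, 1.0, 2.0, 3.0, 4.0]
obs_hist = [10.0, 12.0, 14.0, 16.0, 18.0]
corrected = quantile_map(model_hist, obs_hist, [2.0, 0.5, 4.0])
print(corrected)
```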

Pahlavan et al. [2017] proposed an improved statistical model based on
multiple linear regression (MLR) for the statistical downscaling of monthly
precipitation. They termed their model the Monthly Statistical DownScaling
Model (MSDSM) and developed it on the general structure of the Statistical
DownScaling Model (SDSM). In order to enhance the efficacy of the model, they
adopted statistical modifications that incorporate bias correction using a
variance correction factor (VCF) to refine the pattern of the computed
variance. The authors demonstrated the efficacy of MSDSM through its
application to 288 rain gauge stations scattered over different climatic zones
of Iran. They concluded that MSDSM is more suitable than SDSM for reproducing
the long-term mean and variance of monthly precipitation.

Stennett-Brown et al. [2017] utilized the Statistical DownScaling Model (SDSM)
to study future projections of daily rainfall extremes for 39 stations and of
minimum and maximum temperature extremes for 45 stations in the Caribbean and
neighbouring regions. They reported that their models proved suitable for
predicting the monthly mean daily temperatures and the frequencies of warm and
cool days and nights for the years 1961-2001. The authors predicted variations
in warm (cool) days and nights by 2071-2099 under the A2 and B2 scenarios in
comparison to 1961-1990. However, their models for rainfall demonstrated a
generally lower ability to reproduce the monthly climatic trends of mean daily
rainfall, the spatial distribution of the average yearly maximum number of
consecutive dry days, and the mean yearly count of days with daily rainfall
exceeding 10 mm.

Hanel et al. [2017] developed an R package named "musica" that provides
functionality for validating statistical downscaling methods at multiple time
scales. The musica package was used to verify simulated runoff trends. The
authors demonstrated that conventional downscaling approaches result in
significant biases in simulated runoff at all time scales for a given AOI.

Sa’adi et al. [2017] investigated the variation patterns of rainfall owing to
climate change in the Sarawak region of Borneo Island using statistical
downscaling of GCM projections. The authors used observed rainfall data to
downscale the future rainfall from an ensemble of 20 GCMs of the Coupled Model
Intercomparison Project phase 5 (CMIP5) for four Representative Concentration
Pathway (RCP) scenarios, namely RCP2.6, RCP4.5, RCP6.0 and RCP8.5. Model
Output Statistics (MOS) based downscaling models were developed using two data
mining approaches, Random Forest (RF) and Support Vector Machine (SVM). The
authors reported that SVM is able to downscale all GCMs with a normalized mean
square error (NMSE) of 48.2-75.2 and a skill score (SS) of 0.94-0.98 during
validation. The authors concluded that rainfall varies owing to the effect of
the monsoon season.

Onyutha et al. [2016] compared the outputs of three statistical downscaling
methods, viz. the change factor (Delta) method and the simplified (simQP) and
advanced (wetQP) quantile-perturbation-based methods, for the future
prediction of rainfall on the basis of daily rainfall for the period 1961-2000
at nine meteorological stations in the basin of Lake Victoria in Eastern
Africa. The authors used 14 GCMs from CMIP3 and 7 GCMs from CMIP5 for the
comparative analysis. They reported that the outputs of the three methods are
suitable for simulating the patterns of monthly rainfall totals. Moreover, the
projected changes of seasonal or annual rainfall totals from the evaluated
approaches exhibit nearly the same pattern with subtle differences. However,
the authors reported conspicuous differences between the results of the Delta,
simQP and wetQP approaches in relation to the time series of rainfall
quantiles.

Vu et al. [2016] used the artificial neural network (ANN) technique to
statistically downscale GCMs at meteorological site locations in Bangkok. For
implementing their proposed approach, the authors used large-scale predictor
variables derived from the GCM datasets CGCM3, ECHAM5, MIROC medium resolution
(medres) and MPI-ESM-MR, together with local daily precipitation data for the
period 1980-2000. The predictors were first selected over different grid boxes
surrounding the Bangkok region and were subsequently screened using PCA in
order to select suitably correlated predictors for the ANN application. They
concluded that statistical downscaling techniques are computationally
inexpensive and can easily be applied to analyse the output data of different
GCM datasets.

George et al. [2016] implemented statistical downscaling using local
polynomial regression for developing future climate projections of rainfall in
a selected catchment in Kerala, India. The authors assessed their model in
comparison with MLR and ANN models. They concluded that the local polynomial
regression model performs better in forecasting the rainfall of the study
area.

Sigdel and Ma [2016] applied SDSM for downscaling precipitation data in the
three climatic regions of Nepal. The authors calibrated the downscaling model
using large-scale atmospheric analysis data provided by the National Centers
for Environmental Prediction (NCEP). For the validation of the model, the
outputs of the downscaled scenarios A2 and B2 of the HadCM3 model were
utilized. On the basis of the validation of their model, the authors reported
that the average R² value was 0.84 and that SDSM is suitable for simulating
precipitation. Moreover, their calibrated model performs better over the humid
region than over the subhumid and arid regions.

Poggio and Gimona [2015] presented an approach to downscale climate models
based on a combination of generalized additive models (GAMs) and
geostatistics. They evaluated the effectiveness of the integrated approach
using monthly means of temperature and rainfall to predict soil moisture
conditions. The climate model data were downscaled using integrated methods
combining GAMs and kriging in order to reproduce the spatial pattern. The
downscaled climate model data were subsequently corrected for bias using
interpolated ground station data (1961-1999). The authors implemented their
approach using the open-source software GRASS and R.

Sachindra et al. [2014a,b] presented two statistical downscaling models for a
precipitation station located in Victoria, Australia. The authors developed
the first model on the basis of National Centers for Environmental
Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis
outputs and the second model on the basis of outputs of the Hadley Centre
Coupled Model version 3 GCM (HadCM3). Both models were developed with the
multi-linear regression (MLR) technique. The authors reported that, during
both calibration and validation, these models under-predicted the high
precipitation values and over-predicted the near-zero precipitation values.
Moreover, a bias correction technique was applied by the authors to correct
the raw outputs of the model developed using HadCM3.

Jha et al. [2013] presented a downscaling approach based on multiple-point
geostatistics (MPS) to downscale climate variables, viz. skin surface
temperature, soil moisture and latent heat flux. They implemented the
geostatistical approach of direct sampling (DS), based upon MPS, to sample
spatial patterns within training images in order to develop the downscaling
model at different spatial scales. The authors assessed their approach on data
derived from an RCM of the Murray-Darling basin in southeast Australia with 50
and 10 km spatial resolutions. They analysed the downscaling results and
concluded that the results of their approach are in good agreement with the
spatial distribution of the WRF reference variables at a finer spatial scale
for the surface temperature and heat flux, irrespective of season. However, on
the basis of the downscaled results for soil moisture, the authors concluded
that the DS approach is not suitable for reproducing soil moisture for
small-scale land surface features, specifically for water bodies like canals,
lakes and reservoirs.

Wilby and Dawson [2013] evaluated the application of the widely used
Statistical DownScaling Model (SDSM) software. The authors discussed the
underlying conceptual and technical evolution of SDSM. Moreover, they carried
out independent assessments of the model capabilities of SDSM. On the basis of
a review of related studies using SDSM, they concluded that SDSM provides
reliable estimates of seasonal precipitation totals, extreme temperatures, and
areal and inter-site precipitation behaviour. However, SDSM produces less
satisfactory results for the frequency estimation of extreme precipitation
amounts during summers. Further, they reported that SDSM is unable to
downscale simultaneously to multiple sites; however, the basic model may be
extended to multiple sites through a process of resampling [Wilby et al.,
2003].

Guo et al. [2012] proposed an automated statistical downscaling (ASD) method
for predicting the daily precipitation at 138 meteorological stations in the
Yangtze River basin for the years 2010-2099 using a regression-based
statistical downscaling approach. They used HadCM3 model outputs for the A2
and B2 scenarios. They concluded that their method was suitable for simulating
the amount and the change pattern of precipitation.

Raje and Mujumdar [2011] compared the outputs of three downscaling approaches,
viz. conditional random field (CRF), k-nearest neighbor (KNN) and support
vector machine (SVM), for downscaling point-scale daily precipitation at six
locations in the Punjab region, India, for the monsoon regime only. They
concluded that CRF and KNN performed slightly better than SVM.

Ashiq et al. [2010] utilized different interpolation models in a GIS
environment to generate fine-scale (250 m x 250 m) precipitation surfaces from
PRECIS precipitation data. The authors concluded that the multivariate
extension model of ordinary kriging that utilizes the data is the appropriate
model for downscaling the precipitation data during the monsoon season.

Goyal and Ojha [2010] evaluated different linear regression-based downscaling
models, such as forward, backward, direct and step-wise regression, for
downscaling mean monthly precipitation in the arid Pichola watershed, India.
They concluded that the direct regression-based downscaling model performs
better than the other regression techniques for the investigated region.

Page et al. [2009] developed a software package named dsclim, which is based
upon the statistical downscaling methodology proposed by Boe and Terray
[2008]. dsclim was developed on the basis of the concepts of climate regimes
and large-scale circulations (LSC).

Huth et al. [2008] compared linear and non-linear statistical downscaling
methods using daily winter temperature data of eight stations in Europe. They
compared linear regression of grid-point values (pointwise regression),
regression based on principal component analysis, and artificial neural
networks. The methods were compared on the basis of the fit to observations,
the shape of the statistical distribution in terms of its skewness and
kurtosis, temporal autocorrelations with a 1-day lag, and inter-station
correlations. They concluded that pointwise linear regression outperforms all
the other methods.

Hessami et al. [2008] developed an automated statistical downscaling (ASD)
methodology using backward stepwise regression for predictor selection.
Outputs from the third version of the coupled global Hadley Centre Climate
Model (HadCM3) and the first-generation Canadian Coupled Global Climate Model
(CGCM1) were used by the researchers to validate ASD over the period
1961-1990, and the downscaled model results were compared with observed
temperature and precipitation from 10 meteorological stations of Environment
Canada located in eastern Canada. The authors reported that ASD is more
effective in downscaling temperature than in downscaling precipitation.

Fealy and Sweeney [2007] proposed a downscaling technique based upon the
generalized linear modelling approach that overcomes some of the difficulties
encountered in the prediction of daily precipitation. They tested their method
by predicting the precipitation amounts for a selection of sites in Ireland.

Tripathi et al. [2006] formulated various SVM-based downscaling models 
in order to downscale monthly precipitation at various meteorological subdi¬ 
visions (MSDs) in India. They concluded that the SVM-based model is more 
effective in downscaling the monthly precipitation in comparison to conven¬ 
tional ANN-based downscaling model. 

3.11 Concluding Remarks 

Global Climate Models (GCMs) can replicate global and continental-scale
climate scenarios plausibly well. However, they are still deficient in
producing the fine-resolution datasets required for climate studies at the
local level (for example, a river basin). Various approaches have been
developed by researchers in order to bridge the gap between climate models and
climate impact models. These approaches range from simple downscaling
approaches like the delta method to complex dynamic downscaling approaches
that utilize Regional Climate Models (RCMs). RCMs provide physical parameters
or mathematical models at the regional scale. During the last few decades,
there has been remarkable development in RCMs and in their capability to
replicate the present-day climate at the regional level. However, factors such
as the inherent systematic errors of the GCM datasets that act as boundary
conditions to RCMs, the high computational time, and the requirement for
further downscaling for impact studies hamper the effective application of
RCMs for downscaling GCM data [Wilby et al., 1998].
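
As a minimal sketch of the delta method named above, the fragment below uses
the commonly described formulation in which the GCM-projected change in the
mean is added to an observed temperature series or applied as a ratio to an
observed precipitation series. The function names and all values are
hypothetical, and practical implementations usually compute the change
separately for each month or season.

```python
import numpy as np

def delta_additive(obs, gcm_hist, gcm_future):
    """Shift observations by the projected change in the GCM mean (e.g. temperature)."""
    change = np.mean(gcm_future) - np.mean(gcm_hist)
    return np.asarray(obs, dtype=float) + change

def delta_multiplicative(obs, gcm_hist, gcm_future):
    """Scale observations by the projected ratio of GCM means (e.g. precipitation)."""
    ratio = np.mean(gcm_future) / np.mean(gcm_hist)
    return np.asarray(obs, dtype=float) * ratio

# Hypothetical observed series and GCM means for a single grid box.
temp_future = delta_additive([20.0, 22.0, 24.0], [18.0, 20.0], [20.0, 22.0])
prec_future = delta_multiplicative([0.0, 5.0, 10.0], [4.0, 6.0], [5.0, 7.0])
print(temp_future, prec_future)
```

The multiplicative form keeps dry days dry and avoids negative precipitation,
which is why it is generally preferred over the additive form for rainfall.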

Simple downscaling methods ranging from simple scaling to intricate
distribution mapping have been developed and implemented for downscaling in
the last decade [Chen et al., 2012, 2013, 2011; Chen, 2012; Iizumi et al.,
2011; Lafon et al., 2013; Mpelasoka and Chiew, 2009; Piani et al., 2010; Ryu
et al., 2009; Salvi et al., 2011; Sharma et al., 2007, 2009; Smitha et al.,
2018; Terink et al., 2010; Teutschbein and Seibert, 2012; Teutschbein et al.,
2011].

However, it is imperative to assess and compare the performance of these
methods for a specific impact study. Chen et al. [2013] compared the
performance of six bias correction methods for hydrological modelling of ten
North American river basins. They concluded that the performance of bias
correction methods depends upon the location and that a careful investigation
must be carried out, specifically for studies over new areas of interest
(AOI). Walton et al. [2017] concluded that neither BCSD nor BCCA performs
satisfactorily for downscaling temperature in the Sierra Nevada. Stoner et al.
[2013] reported that simple downscaling methods do not yield satisfactory
results for predicting extreme events pertaining to temperature and
precipitation. Further, they reported that simple methods are suitable for
downscaling monthly/annual means and daily output for tropical regions.
Regarding downscaled temperature and precipitation, the statistical approaches
are comparatively more proficient in reproducing the temporal and spatial
autocorrelation properties.

As an advancement over the simple methods, statistical downscaling methods
based on the statistical relationship between large-scale climate parameters
and local-scale or regional-level parameters, like temperature or
precipitation, can be used. Such statistical methods rest on the assumption
that the statistical relationships between the predictors and the predictand
remain valid for future times as well. However, statistical downscaling can be
used only at locations where precipitation or temperature observation records
are available at point and/or grid level.

On the basis of the comparative analyses carried out by various investigators,
it can be concluded that statistical and dynamical downscaling methods exhibit
similar performance [Murphy, 2000; Sun et al., 2015; Tang et al., 2016a;
Walton et al., 2015; Wood et al., 2004; Yhang et al., 2017].

Murphy [2000], Walton et al. [2015], Wilby and Wigley [2000] and Wood et al.
[2004] concluded that the performance of the statistical and the dynamical
approaches is similar. Likewise, Wood et al. [2004], who investigated six
different downscaling methods in the context of their effectiveness in
simulating temperature and precipitation trends, did not observe any
additional improvement for dynamical downscaling approaches. Wilby and Wigley
[1997] discussed the merits and demerits of the statistical and dynamical
downscaling techniques. They concluded that the dynamic downscaling method,
which is based on a physically consistent process, is able to produce fine
resolution data at the local scale; however, its efficacy depends upon the
bias in the GCM boundary conditions and the effect of regional forcing
phenomena. Also, its use is restricted by its computational cost. Khan et al.
[2006a] evaluated three statistical downscaling methods: the Statistical
DownScaling Model (SDSM), LARS-WG and an artificial neural network. They
compared their potential in terms of reproducing observations and concluded
that SDSM proved to be the most appropriate method for downscaling. On the
basis of a comparative analysis of six downscaling methods, Chen et al. [2012]
and Chen [2012] reported apparent differences in the projections derived from
different downscaling methods. Smid and Costa [2017] carried out a detailed
review of downscaling approaches with application to impact studies in urban
areas. They concluded that statistical downscaling approaches based on
regression are the most suitable in the context of climate change studies in
urban areas.

In the next chapter, the mathematical background for implementing statistical
downscaling based on regression analysis is discussed in detail.

Chapter 4

Mathematical Background 

4.1 General 

The steps associated with downscaling are based on the mathematical concepts
of correlation and regression analysis. The underlying steps of the Efficient
Multi-site Statistical Downscaling Model (EMSDM) implement these concepts for
downscaling. In this chapter, these mathematical concepts are discussed in
detail.

4.2 Predictor 

In statistics, predictor is the more contemporary term for an independent
variable. In this research work, the large-scale variables used as input data
for statistical modelling are known as predictors; they describe the
circulation pattern over a spatial region. Depending on context, a predictor
is also referred to as an "input variable", "independent variable" or
"large-scale variable".

Mathematically, a predictor consisting of a single series is referred to as
uni-variate, and one consisting of several parallel series as multi-variate.
The notation for uni-variate and multi-variate predictors is given in
Table 4.1.

Table 4.1: Type of predictors and notation

Predictor Type             Mathematical Symbol
Uni-variate Predictor      x
Multi-variate Predictor    X


4.3 Predictand 

In statistics, predictand is the more contemporary term for a dependent
variable. In this research work, the small-scale variable used as output data
for statistical modelling is known as the predictand; it represents the
climate variable measured at a climate station. Depending on context, it is
also referred to as an "output variable", "dependent variable", "responding
variable", "response variable", "small-scale variable" or "regressand". A
common definition is "that which is to be predicted".

In a mathematical equation, the predictand is the left-hand-side term. The
mathematical relationship between the predictand and the predictors is:

$$ \text{Predictand(s)} = f(\text{Predictor(s)}) \tag{4.1} $$

As with the predictor, a predictand consisting of a single series is referred
to as uni-variate, and one consisting of several parallel series as
multi-variate. The notation for uni-variate and multi-variate predictands is
given in Table 4.2.


Table 4.2: Type of predictands and notation

Predictand Type             Mathematical Symbol
Uni-variate Predictand      y
Multi-variate Predictand    Y



Equation (4.1) is a general mathematical relationship between predictands and
predictors. The specific relationships, which depend on whether the predictand
and the predictors are uni-variate or multi-variate, are:

$$ y = f(x) \tag{4.2} $$

$$ y = f(X) \tag{4.3} $$

$$ Y = f(X) \tag{4.4} $$

The interconnection of local observations and large-scale circulation patterns
is given by equation 4.2, which is a simple linear statistical relationship
[Von Storch et al., 1993, 1997]. Equation 4.3 relates a single local
observation series to several large-scale predictors and represents the
multiple linear regression equation. Equation 4.4 stands for multivariate
linear regression. In equations 4.2 to 4.4, the predictands and predictors
comprise numerous observations and are represented by vectors (uni-variate)
or matrices (multi-variate). For a multivariate predictand, the matrix Y
refers to the time series y(t), where the columns of Y are vectors. The
matrix fields Y and X, each comprising n measurements, are represented as:

$$ Y = [y_1, y_2, \ldots, y_n] \tag{4.5} $$

$$ X = [x_1, x_2, \ldots, x_n] \tag{4.6} $$


In climate research, more than one climate variable (predictor) is generally
used for statistical downscaling. A downscaling relationship built on a single
climate variable (uni-variate predictor) mostly produces less reliable climate
projections, whereas a relationship built on several climate variables
(multi-variate predictors) generates more reliable climate projections.

4.4 Correlation


Correlation is the dependency, in the form of a statistical relationship,
between two data-sets (random variables). Correlation analysis is a
statistical method commonly used to determine the statistical relationship,
and the direction of that relationship, between two data-sets. Hence, it is
also referred to as bivariate correlation analysis. The statistical
relationship determined by correlation analysis between two data-sets is a
linear statistical relationship.

Correlations are useful because they can indicate a predictive relationship
that can be used in practice. For example, precipitation in a mountain region
can be estimated from the correlation between precipitation and humidity,
because an underlying relationship between humidity and precipitation causes
high precipitation in hilly regions. In general, however, the existence of a
correlation is not sufficient to conclude that an underlying relationship is
present (i.e., correlation does not guarantee it). Probabilistic independence
between two data-sets is an essential mathematical property; otherwise the
data-sets are mutually dependent, which makes them unsuitable for downscaling.

4.5 Correlation Coefficient 

The strength (degree) of the correlation is represented by the correlation
coefficient, whose value lies between +1 and -1. A value of ±1 indicates a
perfect relationship between the two variables. As the correlation coefficient
tends towards zero, the relationship between the two variables becomes weaker.
The direction of the relationship is designated by the sign of the
coefficient: a positive sign refers to a direct relationship and a negative
sign to an inverse relationship between the variables. The correlation
coefficient is often denoted by r or ρ.

In statistics, the correlation coefficient is usually measured by the Pearson
correlation coefficient, the Kendall rank correlation coefficient or
Spearman's rank correlation coefficient. The Pearson correlation coefficient
is the most common among them. It is responsive only to a linear relationship
between two data-sets, while the other two are also sensitive to (monotonic)
nonlinear relationships between data-sets.

A verbal description of the strength of the correlation, based on the absolute
value of the correlation coefficient, is given in Table 4.3.

Table 4.3: Correlation coefficient verbal classification

Verbal Description    Value
very weak             0.00 - 0.19
weak                  0.20 - 0.39
moderate              0.40 - 0.59
strong                0.60 - 0.79
very strong           0.80 - 1.00


4.6 Pearson Correlation Coefficient 

The Pearson correlation coefficient is a parametric statistic that is used to
measure the degree of association between two variables (predictor and
predictand) on the basis of their values (climate measurements). It is the
most extensively used correlation statistic.

It measures the degree of the relationship between two variables that are
linearly related. For example, in climate research, if one wants to measure
how two variables such as the GHG concentration and the concentration of a
specified pollutant such as NOx are related to each other, the Pearson
correlation coefficient can be used to estimate the degree of relationship
between these two variables.

For the Pearson correlation coefficient, both variables should be normally
distributed. Other assumptions include linearity and homoscedasticity.
Linearity assumes a straight-line relationship between the two variables, and
homoscedasticity assumes that the data are equally distributed about the
regression line.

The point-biserial correlation coefficient is used to estimate the
relationship when one of the variables is dichotomous. It is a specialized
variant of the Pearson correlation coefficient and is used to assess the
strength and direction of the association between a dichotomous variable and a
continuous variable. The following formula is used to calculate the Pearson
correlation coefficient r:

$$ r = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}} \tag{4.7} $$

For the development of the computer program, Equation 4.7 can be expanded as:

$$ r = \frac{n\sum_{k=1}^{n} x_k y_k - \sum_{k=1}^{n} x_k \sum_{k=1}^{n} y_k}{\sqrt{n\sum_{k=1}^{n} x_k^2 - \left(\sum_{k=1}^{n} x_k\right)^2}\,\sqrt{n\sum_{k=1}^{n} y_k^2 - \left(\sum_{k=1}^{n} y_k\right)^2}} \tag{4.8} $$

where: r = Pearson correlation coefficient
       n = number of observations
       $\sum x_k y_k$ = summation of the products of paired measures
       $\sum x_k$ = summation of x measures
       $\sum y_k$ = summation of y measures
       $\sum x_k^2$ = summation of squared x measures
       $\sum y_k^2$ = summation of squared y measures



4.7 Spearman’s Rank Correlation Coefficient 


Spearman's rank correlation coefficient is a non-parametric statistic that is
used to measure the degree of association between two variables on the basis
of their ranks. It is the Pearson correlation coefficient between the ranked
variables, which are derived from the given variables on the basis of their
values. Spearman's rank correlation coefficient is the basis of many
nonparametric tests. Nonparametric tests are statistical testing methods in
which the data are not required to follow a normal distribution. The data in
nonparametric tests are generally ordinal in nature, which means that the
tests rely not on the numerical values but on their ranking in the dataset.

Let X and Y be the sets of variables representing the predictors and the
predictands respectively. Their corresponding ranked variable sets are denoted
as follows:

$$ \text{Ranking}(X) = \tilde{X}, \qquad \text{Ranking}(Y) = \tilde{Y} $$

Spearman's rank correlation coefficient is determined using the following
equation:

$$ r_s = \rho_{\tilde{X},\tilde{Y}} = \frac{\mathrm{Cov}(\tilde{X}, \tilde{Y})}{\sigma_{\tilde{X}}\,\sigma_{\tilde{Y}}} \tag{4.9} $$

where: $r_s$ = Spearman's rank correlation coefficient
       $\rho_{\tilde{X},\tilde{Y}}$ = Pearson correlation coefficient of the ranked variables
       $\mathrm{Cov}(\tilde{X}, \tilde{Y})$ = covariance of the ranked variables
       $\sigma_{\tilde{X}}$ and $\sigma_{\tilde{Y}}$ = standard deviations of the ranked variables

If the ranked values of the two variables for a set of n measurements are
$x_k$ and $y_k$, with $d_k = y_k - x_k$, and all ranks are distinct, then
Spearman's rank correlation coefficient is defined as

$$ r_s = \rho = 1 - \frac{6\sum_{k=1}^{n} d_k^2}{n(n^2 - 1)} \tag{4.10} $$

where: $r_s$ = Spearman's rank correlation coefficient
       $d_k$ = difference between the ranks of corresponding variables
       n = number of observations

Here $r_s$ denotes Spearman's rank correlation coefficient between the
rankings x and y. The assumptions of Spearman's rank correlation coefficient
are that the measurements on one variable must be monotonically related to the
other variable and that the data must be at least ordinal.

For this research work, the guidelines given by Yu et al. [2017] have been
used to investigate the degree of the relationship between GCM outputs using
Spearman's rank correlation coefficient. A correlation coefficient between 0
and 0.40 represents a low linear association, a coefficient between 0.40 and
0.70 represents a significant association, and a coefficient of 0.70 and above
represents a large association or relationship.

4.8 Multiple Linear Regression


4.8.1 Mathematical Model 

Multiple linear regression (MLR) is a generalized extension of simple linear
regression to the case in which several predictors are used for the
development of the model. For example, for two predictor variables $x_1$ and
$x_2$ and one predictand y, the regression model can be expressed as:

$$ y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \varepsilon \tag{4.11} $$

This example equation comprises a deterministic component involving the three
regression coefficients ($\alpha_0$, $\alpha_1$ and $\alpha_2$) and a residual
term $\varepsilon$.

Equation (4.11) represents a linear regression model with two predictors.
Likewise, the general equation of the multiple linear regression (MLR) model
is:

$$ y = \alpha_0 + \sum_{k=1}^{n} \alpha_k x_k + \varepsilon \tag{4.12} $$

where: y = predictand (climate variable)
       $x_k$ = k-th predictor
       $\alpha_k$ = k-th regression coefficient
       n = number of predictors
       $\varepsilon$ = residual term

The matrix representation of the multiple linear regression model of equation
(4.12) is:

$$ y = \underset{(1 \times (n+1))}{X}\;\underset{((n+1) \times 1)}{\alpha} + \varepsilon \tag{4.13} $$

$$ y = \begin{bmatrix} 1 & x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} + \varepsilon \tag{4.14} $$

The predictor variables can be either continuous or categorical. Categorical
variables must be encoded as dummy variables. The dependent variable y must be
measured on a continuous scale. The residual terms represent the difference
between the observed and predicted values of the predictand. These terms are
assumed to be independently and identically distributed with zero mean and
constant variance, and they account for natural variability as well as
possible measurement error.


4.8.2 Model Assumptions 

The assumptions of multiple linear regression are as follows:

1. There must be a linear relationship between the predictand and the
predictors. Scatterplots can help to reveal whether the relationship is
linear or curvilinear.

2. The residuals are normally distributed.

3. The independent variables should not be highly correlated with each
other, i.e. they should not exhibit multicollinearity.

4. The variances of the error terms are similar across the values of the
independent variables. This property is termed homoscedasticity. A plot of
standardized residuals versus predicted values can reveal whether the points
are equally distributed across all values of the independent variables.

4.9 Multivariate Linear Regression


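Multivariate linear regression (MvLR) refers to the modelling of data wherein
an outcome is measured for the same phenomenon at multiple times (repeated
measurements), or to the modelling of nested/clustered data, wherein there are
multiple groups of measurements in each cluster. A multivariate linear
regression model is generalized as:

$$ Y = \alpha X + \varepsilon \tag{4.15} $$

where

$$ Y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_p \end{bmatrix}_{(p \times 1)} \tag{4.15a} $$

$$ \alpha = \begin{bmatrix} \alpha_0 & \alpha_1 & \alpha_2 & \cdots & \alpha_n \end{bmatrix}_{(1 \times (n+1))} \tag{4.15b} $$

$$ X = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{((n+1) \times 1)} \tag{4.15c} $$

$$ \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_p \end{bmatrix}_{(p \times 1)} \tag{4.15d} $$

Substituting equations (4.15a) to (4.15d) into equation (4.15) gives:

$$ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_p \end{bmatrix} = \begin{bmatrix} \alpha_0 & \alpha_1 & \alpha_2 & \cdots & \alpha_n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_p \end{bmatrix} \tag{4.16} $$

$$ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_p \end{bmatrix} = \alpha_0 x_0 + \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_p \end{bmatrix} \tag{4.17} $$

$$ Y = \alpha_0 x_0 + \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n + \varepsilon \tag{4.18} $$

$$ Y = \sum_{k=0}^{n} \alpha_k x_k + \varepsilon \tag{4.19} $$

where: Y = predictand set with p predictands $(y_1, y_2, \ldots, y_p)$
       $x_k$ = k-th predictor
       $\alpha_k$ = regression coefficient set with p elements
       $\varepsilon$ = residual set with p elements
       n = number of predictors

Equation (4.19) is a set of p elements. Each element has its own model
equation, which is a multiple linear regression model.

$$ \alpha = \begin{bmatrix} \alpha_0 & \alpha_1 & \alpha_2 & \cdots & \alpha_n \end{bmatrix} = \begin{bmatrix} \alpha_{10} & \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \alpha_{30} & \alpha_{31} & \alpha_{32} & \cdots & \alpha_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ \alpha_{p0} & \alpha_{p1} & \alpha_{p2} & \cdots & \alpha_{pn} \end{bmatrix} \tag{4.20} $$

As expressed in equation (4.15b), $\alpha$ is a set of (n + 1) elements, each
of which is itself a set of p elements. This is expressed in equation (4.20).

Combining equation (4.16) and equation (4.20) gives the following equation:

$$ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_p \end{bmatrix} = \begin{bmatrix} \alpha_{10} & \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{20} & \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \alpha_{30} & \alpha_{31} & \alpha_{32} & \cdots & \alpha_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ \alpha_{p0} & \alpha_{p1} & \alpha_{p2} & \cdots & \alpha_{pn} \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_p \end{bmatrix} \tag{4.21} $$

Expanding equation (4.21) row by row gives the following set of equations:

$$ \begin{aligned} y_1 &= \alpha_{10} x_0 + \alpha_{11} x_1 + \alpha_{12} x_2 + \cdots + \alpha_{1n} x_n + \varepsilon_1 \\ y_2 &= \alpha_{20} x_0 + \alpha_{21} x_1 + \alpha_{22} x_2 + \cdots + \alpha_{2n} x_n + \varepsilon_2 \\ y_3 &= \alpha_{30} x_0 + \alpha_{31} x_1 + \alpha_{32} x_2 + \cdots + \alpha_{3n} x_n + \varepsilon_3 \\ &\;\;\vdots \\ y_p &= \alpha_{p0} x_0 + \alpha_{p1} x_1 + \alpha_{p2} x_2 + \cdots + \alpha_{pn} x_n + \varepsilon_p \end{aligned} \tag{4.22} $$

$$ y_j = \alpha_{j0} x_0 + \alpha_{j1} x_1 + \alpha_{j2} x_2 + \cdots + \alpha_{jn} x_n + \varepsilon_j \tag{4.23} $$

$$ y_j = \sum_{k=0}^{n} \alpha_{jk} x_k + \varepsilon_j \tag{4.24} $$

Equations (4.21) and (4.22) are the matrix representation of the multivariate
linear regression (MvLR) model, which is suitable for implementation in a
computer program.

Equations (4.23) and (4.24) show that the MvLR model can be obtained by
applying the MLR model solution to each of the p elements
$(y_1, y_2, \ldots, y_p)$ of the predictand set Y.
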
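
The row-by-row structure of the MvLR model can be illustrated with a short C
sketch that evaluates the deterministic part of each of the p model equations
(the residual term is omitted). The function name `mvlr_predict` and the
row-major storage of the coefficient matrix are assumptions made for this
example only.

```c
#include <stddef.h>

/* Evaluate y_j = alpha_j0 * x_0 + sum_{k=1..n} alpha_jk * x_k for
 * j = 0..p-1, with x_0 = 1 (intercept).
 *   alpha : p x (n+1) coefficient matrix, row-major (hypothetical layout)
 *   x     : the n predictor values x_1..x_n
 *   y     : output array of length p */
void mvlr_predict(const double *alpha, const double *x,
                  size_t p, size_t n, double *y)
{
    for (size_t j = 0; j < p; ++j) {
        const double *row = alpha + j * (n + 1);
        double acc = row[0];                 /* alpha_j0 * x_0, x_0 = 1 */
        for (size_t k = 1; k <= n; ++k)
            acc += row[k] * x[k - 1];
        y[j] = acc;
    }
}
```

Each iteration of the outer loop is an independent MLR evaluation, which is
why the p predictands can be processed one after another with the same
routine.
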
4.10 Model Solution


The general multiple linear regression model is represented by equation
(4.12). There are n predictors, (n + 1) regression coefficients and one
residual term.

MLR models are often used as empirical models or approximation functions,
because the true functional relationship between y and $x_1, x_2, \ldots, x_n$
is unknown.

More complex models can often be analysed by the MLR technique, for example
the second-order polynomial model in equation (4.25) and the fourth-order
polynomial model in equation (4.26).

$$ y = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \varepsilon \tag{4.25} $$

$$ y = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 + \alpha_4 x^4 + \varepsilon \tag{4.26} $$

Equation (4.25) can be rewritten as

$$ y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \varepsilon \tag{4.27} $$

by substituting $x_1 = x$ and $x_2 = x^2$.

Equation (4.26) can be rewritten as

$$ y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4 + \varepsilon \tag{4.28} $$

by substituting $x_1 = x$, $x_2 = x^2$, $x_3 = x^3$ and $x_4 = x^4$.

Models that include interaction between predictors can also be analysed by
MLR methods, for example the model in the equation below:

$$ y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_{12} x_1 x_2 + \varepsilon \tag{4.29} $$

The model given in equation (4.29) can be rewritten as

$$ y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \varepsilon \tag{4.30} $$

by substituting $x_3 = x_1 x_2$ and $\alpha_3 = \alpha_{12}$.

The result is again a linear regression model equation. The same can be
achieved for higher-order polynomial models, multi-interaction models and
combinations of the two. In every case, a multiple linear regression model
equation is finally obtained.
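
The substitutions above amount to building extra predictor columns before the
regression is fitted. A minimal C sketch is given below; the function names
`polynomial_row` and `interaction_row` are hypothetical and serve only to
illustrate the idea.

```c
#include <stddef.h>

/* Fill row[0..degree] with 1, x, x^2, ..., x^degree, i.e. the predictors
 * x_0 = 1, x_1 = x, x_2 = x^2, ... used to linearise a polynomial model. */
void polynomial_row(double x, size_t degree, double *row)
{
    double power = 1.0;
    for (size_t k = 0; k <= degree; ++k) {
        row[k] = power;
        power *= x;
    }
}

/* Fill row[0..3] with 1, x1, x2, x1*x2, i.e. the predictors used to
 * linearise a two-predictor model with an interaction term (x_3 = x1*x2). */
void interaction_row(double x1, double x2, double row[4])
{
    row[0] = 1.0;
    row[1] = x1;
    row[2] = x2;
    row[3] = x1 * x2;
}
```

Calling one of these functions once per observation yields one row of the
predictor matrix, after which the model is handled exactly like any other MLR
model.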

4.10.1 Estimation of the model parameters by least-squares method 

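Regression coefficients can be estimated by the least-squares method. If there
are m observations and n predictors (regressors), then

$$ m > n \tag{4.31} $$

$$ y = \{y_1, y_2, \ldots, y_m\} \tag{4.32} $$

$$ x_k = \{x_{1k}, x_{2k}, \ldots, x_{mk}\}, \qquad k = 1, 2, \ldots, n \tag{4.33} $$

and the residual term $\varepsilon$ in the model has uncorrelated errors with

$$ E(\varepsilon) = 0 \tag{4.34} $$

$$ \mathrm{Var}(\varepsilon) = \sigma^2 \tag{4.35} $$

As the regression data come from climate observations, most of the predictors
(regressors) are random variables. It is assumed that the observations of each
predictor are independent and that the distribution of the predictors does not
depend on the regression coefficients ($\alpha$) or on the variance
($\sigma^2$).

It is further assumed that, conditional on the predictors
$(x_1, x_2, \ldots, x_n)$, the predictand y is normally distributed with mean
$(\alpha_0 + \alpha_1 x_1 + \cdots + \alpha_n x_n)$ and variance $\sigma^2$.
These variables are sets of m observations, so the multiple linear regression
model equation for each observation is:

$$ y_j = \alpha_0 + \alpha_1 x_{j1} + \alpha_2 x_{j2} + \cdots + \alpha_n x_{jn} + \varepsilon_j = \alpha_0 + \sum_{k=1}^{n} \alpha_k x_{jk} + \varepsilon_j \tag{4.36} $$

where: j = 1, 2, ..., m

$$ \varepsilon_j = y_j - \alpha_0 - \sum_{k=1}^{n} \alpha_k x_{jk} \tag{4.37} $$

The least-squares function is formulated as:

$$ S(\alpha_0, \alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{m} \varepsilon_j^2 \tag{4.38} $$

Applying equation (4.37) to equation (4.38) gives:

$$ S(\alpha_0, \alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{m} \left( y_j - \alpha_0 - \sum_{k=1}^{n} \alpha_k x_{jk} \right)^2 \tag{4.39} $$

S is the function to be minimized with respect to
$(\alpha_0, \alpha_1, \ldots, \alpha_n)$. The condition for the minimum is:

$$ \left. \frac{\partial S}{\partial \alpha_k} \right|_{\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_n} = 0, \qquad k = 0, 1, \ldots, n \tag{4.40} $$

Applying the condition of equation (4.40) to equation (4.39) gives:

$$ \left. \frac{\partial S}{\partial \alpha_0} \right|_{\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_n} = -2 \sum_{j=1}^{m} \left( y_j - \hat{\alpha}_0 - \sum_{k=1}^{n} \hat{\alpha}_k x_{jk} \right) = 0 \tag{4.41} $$

and, for k = 1, 2, ..., n,

$$ \left. \frac{\partial S}{\partial \alpha_k} \right|_{\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_n} = -2 \sum_{j=1}^{m} \left( y_j - \hat{\alpha}_0 - \sum_{l=1}^{n} \hat{\alpha}_l x_{jl} \right) x_{jk} = 0 \tag{4.42} $$

Simplifying equation (4.41) and equation (4.42) gives the equation set (4.43),
the least-squares normal equations:

$$ \begin{aligned} m\hat{\alpha}_0 + \hat{\alpha}_1 \sum_{j=1}^{m} x_{j1} + \hat{\alpha}_2 \sum_{j=1}^{m} x_{j2} + \cdots + \hat{\alpha}_n \sum_{j=1}^{m} x_{jn} &= \sum_{j=1}^{m} y_j \\ \hat{\alpha}_0 \sum_{j=1}^{m} x_{j1} + \hat{\alpha}_1 \sum_{j=1}^{m} x_{j1}^2 + \hat{\alpha}_2 \sum_{j=1}^{m} x_{j1} x_{j2} + \cdots + \hat{\alpha}_n \sum_{j=1}^{m} x_{j1} x_{jn} &= \sum_{j=1}^{m} x_{j1} y_j \\ &\;\;\vdots \\ \hat{\alpha}_0 \sum_{j=1}^{m} x_{jn} + \hat{\alpha}_1 \sum_{j=1}^{m} x_{jn} x_{j1} + \hat{\alpha}_2 \sum_{j=1}^{m} x_{jn} x_{j2} + \cdots + \hat{\alpha}_n \sum_{j=1}^{m} x_{jn}^2 &= \sum_{j=1}^{m} x_{jn} y_j \end{aligned} \tag{4.43} $$

Equation (4.43) comprises n + 1 sub-equations, one for each regression
coefficient. The solutions of the sub-equations are the least-squares
estimators of $\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_n$. For
computation, the matrix notation of the multiple linear regression model is
more suitable.

The matrix notation of the multiple linear regression model in the context of
the data (observations) is:

$$ \underset{(m \times 1)}{y} = \underset{(m \times (n+1))}{X}\;\underset{((n+1) \times 1)}{\alpha} + \underset{(m \times 1)}{\varepsilon} \tag{4.44} $$

The elements of equation (4.44) are represented using the following matrices: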

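$$ y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_m \end{bmatrix}_{m \times 1} \tag{4.44a} $$

$$ X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1n} \\ 1 & x_{21} & x_{22} & \cdots & x_{2n} \\ 1 & x_{31} & x_{32} & \cdots & x_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}_{m \times (n+1)} \tag{4.44b} $$

$$ \alpha = \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}_{(n+1) \times 1} \tag{4.44c} $$

$$ \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_m \end{bmatrix}_{m \times 1} \tag{4.44d} $$

In the equation set (4.44), the predictand y is a vector of m climate
observations. The predictor set X has n + 1 columns as predictors
$(x_0, x_1, x_2, \ldots, x_n)$, where $x_0 = 1_{m \times 1}$ and each of the
remaining predictors is a vector of m climate observations. The regression
coefficient $\alpha$ is a matrix with dimensions (n + 1) x 1 containing the
(n + 1) regression coefficients. The residual $\varepsilon$ is a matrix with
dimensions (m x 1) containing the m residual terms.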

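The vector of least-squares estimators $\hat{\alpha}$ that minimizes the
least-squares function is found as follows:

$$ S(\alpha) = \sum_{j=1}^{m} \varepsilon_j^2 \tag{4.45} $$

As

$$ \sum_{j=1}^{m} \varepsilon_j^2 = \varepsilon' \varepsilon \tag{4.46} $$

where

$$ \varepsilon' = \mathrm{Transpose}(\varepsilon) \tag{4.47} $$

and

$$ \varepsilon = y - X\alpha \tag{4.48} $$

applying equations (4.46), (4.47) and (4.48) to equation (4.45) gives:

$$ S(\alpha) = (y - X\alpha)'(y - X\alpha) \tag{4.49} $$

$$ S(\alpha) = y'y - \alpha'X'y - y'X\alpha + \alpha'X'X\alpha \tag{4.50} $$

$$ (\alpha'X'y)' = y'X\alpha \tag{4.51} $$

Equation (4.51) follows from the algebraic property
$(M_1 M_2)' = M_2' M_1'$ of matrix operations. Since $\alpha'X'y$ is a 1 x 1
matrix (a scalar), it is equal to its own transpose:

$$ \alpha'X'y = (\alpha'X'y)' = y'X\alpha \tag{4.52} $$

So

$$ S(\alpha) = y'y - 2\alpha'X'y + \alpha'X'X\alpha \tag{4.53} $$

For minimization, $S(\alpha)$ must satisfy the following condition:

$$ \left. \frac{\partial S}{\partial \alpha} \right|_{\hat{\alpha}} = 0 \tag{4.54} $$

Applying the condition of equation (4.54) to equation (4.53):

$$ \left. \frac{\partial S}{\partial \alpha} \right|_{\hat{\alpha}} = -2X'y + 2X'X\hat{\alpha} = 0 \tag{4.55} $$

which can be simplified as:

$$ X'X\hat{\alpha} = X'y \tag{4.56} $$

$$ \hat{\alpha} = (X'X)^{-1}X'y \tag{4.57} $$

$$ X' = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & x_{31} & \cdots & x_{m1} \\ x_{12} & x_{22} & x_{32} & \cdots & x_{m2} \\ \vdots & \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & x_{3n} & \cdots & x_{mn} \end{bmatrix}_{(n+1) \times m} \tag{4.57a} $$

$$ X'X = \begin{bmatrix} m & \sum_{j} x_{j1} & \sum_{j} x_{j2} & \cdots & \sum_{j} x_{jn} \\ \sum_{j} x_{j1} & \sum_{j} x_{j1}^2 & \sum_{j} x_{j1} x_{j2} & \cdots & \sum_{j} x_{j1} x_{jn} \\ \sum_{j} x_{j2} & \sum_{j} x_{j2} x_{j1} & \sum_{j} x_{j2}^2 & \cdots & \sum_{j} x_{j2} x_{jn} \\ \vdots & \vdots & \vdots & & \vdots \\ \sum_{j} x_{jn} & \sum_{j} x_{jn} x_{j1} & \sum_{j} x_{jn} x_{j2} & \cdots & \sum_{j} x_{jn}^2 \end{bmatrix}_{(n+1) \times (n+1)} \tag{4.57b} $$

where every sum runs over j = 1, ..., m. Equation (4.57b) shows that X'X is an
(n + 1) x (n + 1) symmetric matrix.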

The fitted regression model corresponding to the matrix form of the predictand
y and the predictors X is:

$$ \hat{y} = X\hat{\alpha} \tag{4.58} $$

Applying equation (4.57) to equation (4.58):

$$ \hat{y} = X(X'X)^{-1}X'y \tag{4.59} $$

$$ H = X(X'X)^{-1}X' \tag{4.60} $$

H is known as the hat matrix, with dimension m x m.

$$ \hat{y} = Hy \tag{4.61} $$

For determining the residual terms, equation (4.63) is derived as follows:

$$ \varepsilon = y - \hat{y} \tag{4.62} $$

$$ \varepsilon = y - X\hat{\alpha} \tag{4.62a} $$

$$ \varepsilon = y - Hy \tag{4.62b} $$

$$ \varepsilon = (I - H)y \tag{4.62c} $$

$$ \varepsilon = \left(I - X(X'X)^{-1}X'\right)y \tag{4.63} $$

Using the data matrices shown in equations (4.44a), (4.44b), (4.57a) and
(4.57b), the residual $\varepsilon$ can be computed. These equations are
computationally convenient for implementation.
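
To indicate how the matrix solution can be programmed, the sketch below forms
X'X and X'y and solves the normal equations X'X α = X'y by Gaussian
elimination with partial pivoting, instead of computing the inverse
explicitly. It is a minimal illustration under stated assumptions and not the
EMSDM code: the function names `mlr_fit` and `mlr_residuals`, the row-major
storage of X (with the leading column of ones already included) and the
singularity tolerance of 1e-12 are all choices made for this example.

```c
#include <math.h>
#include <stdlib.h>

/* Least-squares estimate of alpha from the normal equations X'X alpha = X'y.
 *   X     : m x q matrix, row-major, q = n + 1, first column all ones
 *   y     : m observations of the predictand
 *   alpha : output, q regression coefficients
 * Returns 0 on success, -1 on bad sizes, allocation failure or a
 * (near-)singular X'X. The 1e-12 tolerance is an illustrative assumption. */
int mlr_fit(const double *X, const double *y, size_t m, size_t q,
            double *alpha)
{
    if (q == 0 || m < q)
        return -1;

    size_t w = q + 1;                    /* width of augmented [X'X | X'y] */
    double *A = calloc(q * w, sizeof *A);
    if (!A)
        return -1;

    /* Accumulate X'X and X'y one observation (row of X) at a time. */
    for (size_t j = 0; j < m; ++j) {
        const double *row = X + j * q;
        for (size_t a = 0; a < q; ++a) {
            for (size_t b = 0; b < q; ++b)
                A[a * w + b] += row[a] * row[b];
            A[a * w + q] += row[a] * y[j];
        }
    }

    /* Forward elimination with partial pivoting. */
    for (size_t c = 0; c < q; ++c) {
        size_t piv = c;
        for (size_t r = c + 1; r < q; ++r)
            if (fabs(A[r * w + c]) > fabs(A[piv * w + c]))
                piv = r;
        if (fabs(A[piv * w + c]) < 1e-12) {
            free(A);
            return -1;                   /* predictors are collinear */
        }
        for (size_t k = 0; k < w; ++k) { /* swap rows c and piv */
            double tmp = A[c * w + k];
            A[c * w + k] = A[piv * w + k];
            A[piv * w + k] = tmp;
        }
        for (size_t r = c + 1; r < q; ++r) {
            double f = A[r * w + c] / A[c * w + c];
            for (size_t k = c; k < w; ++k)
                A[r * w + k] -= f * A[c * w + k];
        }
    }

    /* Back substitution. */
    for (size_t i = q; i-- > 0; ) {
        double s = A[i * w + q];
        for (size_t k = i + 1; k < q; ++k)
            s -= A[i * w + k] * alpha[k];
        alpha[i] = s / A[i * w + i];
    }

    free(A);
    return 0;
}

/* Residuals e = y - X * alpha for a fitted coefficient vector. */
void mlr_residuals(const double *X, const double *y, const double *alpha,
                   size_t m, size_t q, double *e)
{
    for (size_t j = 0; j < m; ++j) {
        double fitted = 0.0;
        for (size_t k = 0; k < q; ++k)
            fitted += X[j * q + k] * alpha[k];
        e[j] = y[j] - fitted;
    }
}
```

Solving the linear system directly is generally preferred to forming the
inverse of X'X, and a failure return signals the multicollinearity that the
model assumptions are meant to exclude. For the MvLR case, the same routine
would simply be called once per predictand.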

The mathematical solution discussed so far is for the MLR model. As a
generalization of the MLR model, the MvLR model refers to the modelling of
nested/clustered data, wherein multiple groups of measurements are taken. If
p groups of measurements are present, the above solution is applied to each
group.

4.11 Concluding Remarks 

In this chapter, the underlying mathematical concepts of the statistical
downscaling approach adopted by EMSDM have been discussed. Suitable predictors
(GCM output parameters) are selected on the basis of correlation statistics.
These statistics are extensively used for selecting suitable predictors from
the available GCM outputs [Chen et al., 2012; Meenu et al., 2013]. Multiple
and multivariate regression techniques are widely adopted by statistical
downscaling approaches for model development [Tang et al., 2016a; Yang et al.,
2017a].

In the subsequent chapter 5, the basic framework of the Efficient Multi-site
Statistical Downscaling Model (EMSDM) is discussed in detail.

Chapter 5


Efficient Multi-site Statistical Downscaling Model (EMSDM)

5.1 General 

In this chapter, the proposed computational framework, namely the Efficient
Multi-site Statistical Downscaling Model (EMSDM), is discussed. EMSDM employs
automation to carry out downscaling of multiple grids. The proposed framework
can easily incorporate different GCMs and different types of dataset for
downscaling. In the subsequent section, the underlying steps of the framework
and their flow are discussed. The detailed implementation of the framework is
discussed in chapter 6.

5.2 Framework 

The Efficient Multi-site Statistical Downscaling Model is depicted in Figure
5.1. Owing to its widespread usage, its nominal computational resource
requirements and the possibility of extracting site-specific information, the
statistical downscaling approach is adopted for high-resolution downscaling of
a given AOI. As is evident from Figure 5.1, some of the steps of EMSDM are
automated to accomplish downscaling for all the grids or locations of a given
AOI. The steps to carry out downscaling using EMSDM are discussed in the
subsequent sub-sections.

Figure 5.1: Framework for EMSDM

5.2.1 Data Modelling


GCM and climatological data are available in different formats that are
spatially referenced in grid form. However, these datasets are not adequately
structured for downscaling over an AOI. In the data modelling step, the
available data are structured suitably for the framework. Moreover, metadata
for the datasets used for downscaling are not available. Structuring of the
datasets in a suitable spatial format, along with their metadata, is automated
through the development and implementation of computer programmes.

5.2.2 Spatial Analysis 

In this process, the spatial metadata prepared in the previous step are used
to validate and spatially map the GCM dataset to local climate data, such as
IMD data, through GIS-based operations like overlay and encoding of the local
climate dataset with the GCM dataset. Spatial analysis is important for
automating the mapping of the relevant GCM dataset to the corresponding local
climate dataset of interest. This process automates the time-intensive manual
validation and mapping of the GCM dataset to the local dataset for large
areas.

5.2.3 Temporal Analysis 

GCM data and local climate data are available at different time scales and
for different years. In this step, temporal mapping of the GCM dataset to the
local climate data, such as IMD data, is carried out. Moreover, statistical
aggregation of the daily climate dataset is automated to develop monthly,
yearly and decadal climate datasets. Through this process, monthly, yearly and
decadal climate datasets are readily available for further applications such
as trend visualization and downscaling. This process also significantly
reduces the computational time required for the preparation of data-sets.
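
The aggregation of daily values into monthly values can be sketched in C as
shown below. The encoding of the calendar as a running 0-based month number
per day, and the function name `aggregate_monthly`, are assumptions made for
this example; the thesis does not specify how EMSDM stores dates.

```c
#include <stddef.h>

/* Aggregate daily values into monthly means.
 *   daily    : n_days daily values
 *   month_id : running 0-based month number of each day (hypothetical
 *              encoding, e.g. 0 = first month of the record)
 *   mean     : output, n_months monthly means (0 for months without data)
 *   count    : output, number of days that contributed to each month
 * Days whose month_id falls outside 0..n_months-1 are ignored. */
void aggregate_monthly(const double *daily, const int *month_id,
                       size_t n_days, size_t n_months,
                       double *mean, size_t *count)
{
    for (size_t m = 0; m < n_months; ++m) {
        mean[m] = 0.0;
        count[m] = 0;
    }
    for (size_t d = 0; d < n_days; ++d) {
        int id = month_id[d];
        if (id < 0 || (size_t)id >= n_months)
            continue;
        mean[id] += daily[d];            /* running sum for now */
        ++count[id];
    }
    for (size_t m = 0; m < n_months; ++m)
        if (count[m] > 0)
            mean[m] /= (double)count[m]; /* convert sum to mean */
}
```

For a variable such as precipitation, a monthly total rather than a mean may
be wanted; in that case the final division is simply omitted. Yearly and
decadal aggregates follow the same pattern with a different grouping index.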
5.2.4 Screening Predictors


GCM datasets comprise various climate parameters. For example, the CanESM2
data-set comprises 26 climate parameters. In the context of downscaling, these
climate parameters are known as predictors. The effect of these predictors
varies geographically; hence, the suitable set of predictors varies from grid
to grid.

In this step, non-dominating predictors are screened out from the set of
predictors so as to obtain the set of dominating predictors for the
downscaling process. The major criterion for selecting dominating predictors
is that they are strongly correlated with the predictands [Grimes et al.,
2003; Yang et al., 2017a].

For the proposed framework, screening of predictors is carried out using two
types of statistics:

1. Parametric Statistics 

2. Non-parametric Statistics 

Generally, selection of the type of statistics for determining the correlation
coefficients requires manual intervention by the user. Extreme events act as
outliers. A single outlier can affect parametric statistics by inflating the
variance and enlarging the error term, which can invalidate the conclusions
drawn from the parametric statistics. Hence, when extreme events occur
frequently, non-parametric statistics are more suitable for the selection of
predictors. The model implements the Pearson correlation coefficient for
parametric statistics and Spearman's rank correlation coefficient for
non-parametric statistics. The estimated correlation coefficients are
subsequently used to screen out non-relevant predictors and to select the
dominating predictors.

The threshold value for selecting the set of predictors is determined
automatically by an algorithm, subject to the condition that the set of
selected predictors must not be the null set. Generally, the threshold value
of the correlation coefficient for screening the set of predictors is more
than 0.7. Moreover, factor analysis is used by the model for dimensionality
reduction of the selected predictor set when the cardinality of the set
generated by the aforementioned algorithm is high (for example, larger than
12).
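
One possible way to realise such an automatic threshold is sketched below in
C: the threshold starts at a chosen value and is lowered in fixed steps until
at least one predictor is selected. This is an illustration of the stated
non-empty-set condition only; the function name `screen_predictors`, the
starting value and the step size are assumptions, and the algorithm actually
used by EMSDM may differ.

```c
#include <math.h>
#include <stddef.h>

/* Select the predictors whose |r| is at least the threshold, lowering the
 * threshold from `start` in steps of `step` until the selection is
 * non-empty or the threshold reaches zero.
 *   r        : correlation coefficient of each of the n predictors
 *   selected : output, indices of the selected predictors (room for n)
 *   used     : output, threshold at which the selection was made
 * Returns the number of selected predictors (0 for invalid input). */
size_t screen_predictors(const double *r, size_t n, double start,
                         double step, size_t *selected, double *used)
{
    if (n == 0 || step <= 0.0)
        return 0;

    for (size_t s = 0; ; ++s) {
        double t = start - (double)s * step;
        if (t < 0.0)
            t = 0.0;

        size_t count = 0;
        for (size_t i = 0; i < n; ++i)
            if (fabs(r[i]) >= t)
                selected[count++] = i;

        if (count > 0 || t == 0.0) {     /* non-empty set, or cannot relax */
            *used = t;
            return count;
        }
    }
}
```

For example, with coefficients {0.2, -0.65, 0.5}, a start of 0.7 and a step of
0.1, no predictor passes at 0.7, and the second predictor is selected once the
threshold is relaxed to 0.6.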

5.2.5 Model Generation 

In this step, the downscaling model is developed using Multiple Linear
Regression (MLR) and Multivariate Linear Regression (MvLR) analysis. The
screened predictors and the daily local station data-set (predictands) are
used to develop the model. EMSDM can develop the downscaling model from the
screened predictors and the daily local climate data-set (predictands), as
well as from aggregated local climate data-sets (predictands) such as monthly
or decadal data-sets. In this step, computationally intensive mathematical
operations are applied to large predictand and predictor data-sets supplied as
matrices of large dimensions. These complex mathematical operations are
implemented in the 'C' programming language.
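
As an illustration of the regression step, the following Python sketch fits a
multiple linear regression by solving the normal equations with Gaussian
elimination. It is only a sketch under stated assumptions: the names `solve`,
`fit_mlr` and `predict` are hypothetical, the actual EMSDM routines are
written in C and may use a different numerical method, and the predictor
matrix is assumed to have linearly independent columns.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(rows, y):
    """Least-squares coefficients [intercept, b1, ..., bk] for y given rows."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, row):
    """Apply fitted coefficients to one row of predictor values."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], row))
```

For several predictands (the multivariate case), one simple option is to call
`fit_mlr` once per predictand column with the same predictor rows. Applying
`predict` to scenario predictor rows yields a downscaled series from the
fitted coefficients.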

5.2.6 Time Series Analysis 

In this step, the model developed in the previous step is used to generate
time series of the climatological parameters for future years and decades
under the specified RCP scenario. The model is fed with the data-set of the
specified RCP scenario and the model coefficients to generate the future time
series of the climatological parameters.

5.2.7 Geovisualization of Results

Geovisualization of the downscaling results is carried out using a Web GIS
application. The Web GIS application is developed using Django, an open-source
Python-based web framework. The user can query the available outputs through
the developed Web GIS application.

5.3 Concluding Remarks 

In this chapter, the underlying steps of EMSDM have been discussed. The
proposed framework is able to carry out multi-site downscaling for a given
area of interest (AOI). The different processes of EMSDM are automated, hence
manual intervention is not required to develop the downscaling model.

EMSDM is implemented in the “C” programming language. Generally, climate
studies require the handling of large amounts of data and intensive
processing. An application developed in the “C” programming language supports
fast processing and can efficiently handle large volumes of GCM and local
data-sets.

Chapter 6

Implementation 

6.1 General 

In this chapter, the implementation of EMSDM is described in the form of its
primary algorithms, given as pseudo codes. The local precipitation data-set
(predictand) of India, acquired from the India Meteorological Department
(IMD), and the GCM data-set (predictors) of the second generation Canadian
Earth System Model (CanESM2), acquired from the Canadian Centre for Climate
Modelling and Analysis (CCCma), have been used for validating the EMSDM.

As the primary climate data, the precipitation data available from IMD is
structured as grids with a grid resolution of 0.5° x 0.5°. Each grid may
contain one or more observation stations. The temporal resolution of the
climate data-set obtained from IMD is one day, and its temporal duration is
from year 1971 to 2005.

The GCM, CanESM2, obtained from CCCma is also structured as grids, with a grid
resolution of 2.7906° x 2.8125°. The temporal resolution of the predictor
data-set obtained from this GCM is one day.

Well-known text (WKT), a text markup language, is used in this research work
for creating spatial data. The standardized formats of WKT are discussed in
the next section and are subsequently used for generating the spatial grid
data.

6.2 Well-known text (WKT)

WKT is a text markup language. It is used for representing, referencing and
transforming spatial data. Well-known binary (WKB) is the binary equivalent of
WKT and is used to transfer and store the same information in databases. The
WKT and WKB formats were originally defined by the Open Geospatial Consortium
(OGC), which provides the specification of these formats in its Coordinate
Transformation Service and Simple Feature Access specifications. At present,
these formats are standardized through the ISO/IEC 13249-3:2016 and ISO
19162:2015 standards. WKT can represent the following types of geometric
objects (ISO/IEC 13249-3:2016):

1. Geometry 

2. Point 

3. LineString 

4. Polygon 

5. MultiPoint

6. MultiLineString

7. MultiPolygon

8. GeometryCollection

9. CircularString

10. CompoundCurve

11. CurvePolygon

12. MultiCurve

13. MultiSurface 

14. Curve 

15. Surface 

16. PolyhedralSurface 

17. TIN 

18. Triangle

Table 6.1: Basic Geometry Types of WKT Format

Geometry primitives (2D)

Type          Examples
Point         POINT (30 10)
LineString    LINESTRING (30 10, 10 30, 40 40)
Polygon       POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
              POLYGON ((35 10, 45 45, 15 40, 10 20, 35 10),
                       (20 30, 35 35, 30 20, 20 30))


Table 6.2: Complex Geometry Types of WKT Format

Geometry primitives (2D)

Type               Examples
MultiPoint         MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
                   MULTIPOINT (10 40, 40 30, 20 20, 30 10)
MultiLineString    MULTILINESTRING ((10 10, 20 20, 10 40),
                                    (40 40, 30 30, 40 20, 30 10))
MultiPolygon       MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)),
                                 ((15 5, 40 10, 10 20, 5 10, 15 5)))
                   MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),
                                 ((20 35, 10 30, 10 10, 30 5, 45 20, 20 35),
                                  (30 20, 20 15, 20 25, 30 20)))
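
To make the POLYGON notation concrete, the following Python sketch parses a
WKT polygon into its rings and formats it back. It is a minimal illustration
for the two-dimensional POLYGON form only; the function names are hypothetical
and it is not a general WKT parser.

```python
def parse_wkt_polygon(text):
    """Parse 'POLYGON ((x y, ...), (...))' into a list of rings of (x, y)."""
    text = text.strip()
    if not text.upper().startswith("POLYGON"):
        raise ValueError("not a WKT POLYGON")
    body = text[len("POLYGON"):].strip()
    if not (body.startswith("((") and body.endswith("))")):
        raise ValueError("malformed POLYGON body")
    rings = []
    for ring_text in body[2:-2].split("),"):
        ring_text = ring_text.strip().lstrip("(")
        ring = [tuple(float(v) for v in pair.split())
                for pair in ring_text.split(",")]
        if ring[0] != ring[-1]:  # a WKT ring repeats its first point
            raise ValueError("ring is not closed")
        rings.append(ring)
    return rings


def to_wkt_polygon(rings):
    """Format rings of (x, y) tuples back into WKT POLYGON text."""
    parts = ["(" + ", ".join(f"{x:g} {y:g}" for x, y in ring) + ")"
             for ring in rings]
    return "POLYGON (" + ", ".join(parts) + ")"
```

The first ring is the exterior boundary and any further rings are holes, as
in the second Polygon example of Table 6.1.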
6.3 Pre-Processing of Local Data-sets (Predictand)


For the present research work, the predictand data-set is acquired from IMD.
The data-set is in the form of 35 files, one for each year of precipitation
measurements from 1971 to 2005. Each of these 35 files comprises daily
observation values of precipitation in binary format. These observed values
are arranged in a data-matrix of dimension (65 x 69), where each data-matrix
element represents a grid. Hence, 65 x 69 rectangular grids hold the
precipitation data of India and its vicinity. The values associated with the
grids are of the following types:

1. Valid value: > 0.

2. Invalid value: -999 (NULL).

The total number of days from year 1971 to 2005 is 12784 (35 years, of which 9
are leap years). For each day, the number of precipitation data points is 4485
(= 65 x 69).
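
The size of the data-set can be checked with a few lines of Python. The sketch
assumes that the record runs from 1 January 1971 to 31 December 2005
inclusive; the calendar arithmetic then gives 12784 days.

```python
from datetime import date

# Days from 1 January 1971 up to and including 31 December 2005.
n_days = (date(2006, 1, 1) - date(1971, 1, 1)).days

rows, cols = 65, 69                 # dimensions of the IMD grid-matrix
points_per_day = rows * cols        # grid values stored for each day
total_values = n_days * points_per_day

print(n_days, points_per_day, total_values)
```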

A procedure to extract the information on valid grids from the raw IMD data
has been developed. This procedure is given in Pseudo Code 1. By applying this
procedure, one obtains the maximum number of grids that have a valid value and
lie within the administrative boundary of India. To keep the pseudo codes
general, the constants Row and Col used for the grid-matrix are not fixed to
the grid-matrix dimensions of IMD's precipitation data, which are 65 for Row
and 69 for Col.

However, some stations were established by IMD in later years. The number of
grids established in later years can be determined using the procedure given
in Pseudo Code 2. From the algorithms in Pseudo Codes 1 and 2, the minimum and
maximum numbers of grids that correspond to India, and hence the number of
grids established in later years, have been acquired. In this way, the
framework is flexible as well as scalable to handle the varying cardinality of
the data-sets.



Pseudo Code 1. To extract valid grids from the raw IMD data

Set Count ← 0
Set Grid_Count ← 0
Set File_List ← [IMD Files]
foreach file in File_List do
    foreach DataSet[Row][Col] in file do
        for row = 1 to Row do
            for col = 1 to Col do
                if DataSet[row][col] > 0 then
                    Count = Count + 1
                end
            end
        end
        if Grid_Count < Count then
            Grid_Count = Count
        end
        Count = 0
    end
end
Print Grid_Count





Pseudo Code 2. To find number of grids which are established in later years

Set Count ← 0
Set Grid_Count_Min ← Grid_Count
Set File_List ← [IMD Files]
foreach file in File_List do
    foreach DataSet[Row][Col] in file do
        for row = 1 to Row do
            for col = 1 to Col do
                if DataSet[row][col] > 0 then
                    Count = Count + 1
                end
            end
        end
        if Count < Grid_Count_Min then
            Grid_Count_Min = Count
        end
        Count = 0
    end
end
New_Grids = Grid_Count - Grid_Count_Min
Print New_Grids
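
The counting logic of Pseudo Codes 1 and 2 can be sketched in Python as
follows. The sketch works on daily matrices already held in memory as nested
lists, because the binary layout of the IMD files is not described here, and
the function names are hypothetical. The validity test mirrors the `> 0`
comparison used in the pseudo codes.

```python
def count_valid(grid):
    """Number of cells holding a valid value (mirrors the '> 0' test)."""
    return sum(1 for row in grid for value in row if value > 0)


def grid_count_range(daily_grids):
    """Return (minimum, maximum) valid-cell count over all daily matrices."""
    counts = [count_valid(grid) for grid in daily_grids]
    return min(counts), max(counts)


def new_grids(daily_grids):
    """Difference between the largest and smallest valid-cell counts."""
    smallest, largest = grid_count_range(daily_grids)
    return largest - smallest
```

Computing the minimum and maximum in one pass over the counts avoids reading
the data twice, which the two separate pseudo codes would require.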





A procedure to identify the data-set that contains the full number of valid
grids in the raw IMD data has been developed. This procedure is given in
Pseudo Code 3. By applying this procedure, the global variable Data_Set_Number
is set to the number of the data-set whose count of valid grids, lying
specifically within India's administrative boundary, equals Grid_Count. This
global variable is used by many modules in further processing, wherever the
grids that have a valid value and lie within the administrative boundary of
India are required.

Subsequently, the algorithm presented in Pseudo Code 4 is used to determine
the index values of the valid grids in the grid-matrix that lie within the
administrative boundary of India. At first, the algorithm identifies the right
grid-matrix, i.e. the one containing the required number of valid grids (the
data-set numbered Data_Set_Number). As shown in Figure 6.1, the index values
required for generating the spatial data of any particular grid are acquired
with reference to the origin grid. These relative coordinates are calculated
as the row and column indices (i, j) of the grid. The algorithm acquires these
indices and saves them in a file named INDEX_FILE.csv. This file is used as
input for processing the data-set and also serves as metadata of the grids of
the IMD precipitation data for India. The metadata stored in INDEX_FILE.csv is
used to separate the data of the individual grids from the grid-matrix, as
required for downscaling the precipitation data. Using the algorithm given in
Pseudo Code 5, the relevant grid data has been extracted from each year's data
file. The algorithm is applied repeatedly to the data of each year in the
temporal duration. The IMD precipitation data-set used for validation has a
temporal duration of 35 years. For every year, the total number of grid files
to which the algorithm is applied is 1251.



Pseudo Code 3. To determine data set number out of all grids

Set Count ← 0
Set Data_Set_Number ← 0
Set File_List ← [IMD Files]
foreach file in File_List do
    foreach DataSet[Row][Col] in file do
        for row = 1 to Row do
            for col = 1 to Col do
                if DataSet[row][col] > 0 then
                    Count = Count + 1
                end
            end
        end
        Data_Set_Number = Data_Set_Number + 1
        if Count == Grid_Count then
            Goto Outside_Loop
        end
        Count = 0
    end
end
Outside_Loop:
Print Data_Set_Number





Pseudo Code 4. To determine indices of the grids.

Create INDEX_FILE
Open INDEX_FILE as ifile
Set Count ← 0
Set File_List ← [IMD Files]
Write("X,Y") in ifile
foreach file in File_List do
    Count = 0
    foreach DataSet[Row][Col] in file do
        Count = Count + 1
        if Count == Data_Set_Number then
            for row = 1 to Row do
                for col = 1 to Col do
                    if DataSet[row][col] > 0 then
                        Write(row, col) in ifile
                    end
                end
            end
            Goto Outside_Loop
        end
    end
end
Outside_Loop:
Save ifile
Close ifile
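
The index file produced by Pseudo Code 4 can be sketched in Python with the
standard `csv` module. The function names are hypothetical, the matrix is
assumed to be available as nested lists, and the two-digit zero padding of the
grid file names is an assumption based on the file names visible in Figure
6.1.

```python
import csv
import io


def write_index(grid, out):
    """Write 1-based (row, col) of every valid cell of `grid` as CSV rows."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["X", "Y"])
    for r, row in enumerate(grid, start=1):
        for c, value in enumerate(row, start=1):
            if value > 0:  # same validity test as the pseudo code
                writer.writerow([r, c])


def read_index(source):
    """Read the index CSV back into a list of (row, col) integer pairs."""
    reader = csv.reader(source)
    next(reader)  # skip the "X,Y" header
    return [(int(x), int(y)) for x, y in reader]


def grid_file_name(x, y):
    """Per-grid file name in the '05X35Y.dat' style (zero padding assumed)."""
    return f"{x:02d}X{y:02d}Y.dat"


if __name__ == "__main__":
    buffer = io.StringIO()
    write_index([[-999, 1.5], [3.0, -999]], buffer)
    print(buffer.getvalue())
```

Writing to any file-like object keeps the sketch testable in memory; an opened
INDEX_FILE.csv can be passed in the same way.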





Pseudo Code 5. To extract data for the grids.

Open INDEX_FILE as meta
Read("X,Y") from meta
while Read(x, y) from meta till EOF do
    String Grid_Name = ToString(x, "X", y, "Y.dat")
    Create Grid_Name
    Open Grid_Name as grid
    Open IMD Year One File as year
    foreach DataSet[Row][Col] in year do
        Write(DataSet[x][y]) in grid
    end
    Save grid
    Close grid
    Close year
end
Close meta





Using the algorithm given in Pseudo Code 5, the files comprising the
precipitation data are created programmatically for the 35 years. Each file
comprises the precipitation data of one year for a particular grid. Figure 6.1
depicts the directory containing these files.


[Figure 6.1 shows a directory listing of the per-grid data files, named by
their grid indices, e.g. 05X35Y.dat, 05X36Y.dat, 06X34Y.dat.]

Figure 6.1: Separated structured grid files from raw data.


The files are named on the basis of the grid metadata stored in the file
INDEX_FILE.csv. This compilation of data results in 43785 files (= 1251 x 35)
holding the precipitation data of the valid grids. The grid files comprising
the precipitation data of each year are saved in 35 file directories, one per
year. For collating the precipitation data of the 35 years into a single file
for each grid, the algorithm given in Pseudo Code 6 has been developed and
implemented. By applying this algorithm, all grid files, each holding the data
of all years, are collected in one directory. For further processing, this
directory is renamed as IMD_Grids.

For implementing downscaling, the spatial data of the IMD grids, as shown in
Figure 6.2, is also required. In order to generate the spatial metadata
corresponding to each valid grid, the base data given in INDEX_FILE.csv is
utilized. This spatial metadata is stored in WKT format, which has been
discussed in the earlier section.
Pseudo Code 6. To collate precipitation data of 35 years for each grid.

String Parent_Dir
Set Parent_Dir ← LocateParentDir()
for Count = 1 to Grid_Count do
    OpenDir Parent_Dir as pdir
    dir_name = ReadDir(pdir)
    OpenDir dir_name as dir
    for F_Count = 1 to Count do
        file_name = ReadDir(dir)
    end
    CloseDir dir
    Open file_name as main_file
    while Read(value) from main_file till EOF do EmptyLoop
    while (dir_name = ReadDir(pdir)) ≠ NULL do
        OpenDir dir_name as dir
        for F_Count = 1 to Count do
            file_name = ReadDir(dir)
        end
        Open file_name as temp_file
        while Read(value) from temp_file till EOF do
            Write(value) in main_file
        end
        Close temp_file
        CloseDir dir
    end
    CloseDir pdir
    Save main_file
    Close main_file
end





For this research, the data structure specified in the WKT documentation
[ISO/IEC 13249-3:2016] has been utilized for generating the point and polygon
vector data-sets. However, other data structures and details can be used to
extend the EMSDM. The procedure for creating the metadata of the spatial
component of the IMD grids is given in Pseudo Code 7.

The file IMD_Grid.wkt obtained by the aforementioned algorithm is the spatial
metadata for the IMD grids. Figure 6.2 depicts the basic structure of the
IMD_Grid.wkt file.

Figure 6.2: Separated structured grid files from raw data.
Pseudo Code 7. To create spatial metadata for IMD grids.

Set GRID_START_LONG ← START_LONG
Set GRID_START_LAT ← START_LAT
Set OFFSET ← 0.5
Set C_OFFSET ← 0.25
Create IMD_Grid.wkt
Open IMD_Grid.wkt as s_meta
Open INDEX_FILE as meta
Read("X,Y") from meta
Write("POLYGON;XY;X;Y;Longitude;Latitude") in s_meta
while Read(x, y) from meta till EOF do
    Min_Long = GRID_START_LONG + OFFSET * (x - 1)
    Max_Long = Min_Long + OFFSET
    Min_Lat = GRID_START_LAT + OFFSET * (y - 1)
    Max_Lat = Min_Lat + OFFSET
    C_Long = Min_Long + C_OFFSET
    C_Lat = Min_Lat + C_OFFSET
    Write("\n\"POLYGON((") in s_meta
    Write(Min_Long, Min_Lat, ) in s_meta
    Write(Max_Long, Min_Lat, ) in s_meta
    Write(Max_Long, Max_Lat, ) in s_meta
    Write(Min_Long, Max_Lat, ) in s_meta
    Write(Min_Long, Min_Lat, "))\";") in s_meta
    Write(x"X"y"Y;") in s_meta
    Write(x, y, ) in s_meta
    Write(C_Long, C_Lat, ) in s_meta
end
Save s_meta
Close s_meta
Close meta
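
The record written for each grid by Pseudo Code 7 can be sketched in Python as
below. The function names are hypothetical, and the exact text layout (number
formatting, unpadded indices in the XY key, a closing `))"` after the ring) is
an assumption made for illustration. The start coordinates used in the example
are the values 66.75 and 6.75 that appear in Pseudo Code 11.

```python
def grid_record(x, y, start_long, start_lat, offset=0.5):
    """Build one 'POLYGON;XY;X;Y;Longitude;Latitude' record for grid (x, y)."""
    min_long = start_long + offset * (x - 1)
    min_lat = start_lat + offset * (y - 1)
    max_long, max_lat = min_long + offset, min_lat + offset
    c_long, c_lat = min_long + offset / 2, min_lat + offset / 2  # cell centre
    corners = [(min_long, min_lat), (max_long, min_lat), (max_long, max_lat),
               (min_long, max_lat), (min_long, min_lat)]  # closed ring
    ring = ", ".join(f"{lon:g} {lat:g}" for lon, lat in corners)
    return f'"POLYGON(({ring}))";{x}X{y}Y;{x};{y};{c_long:g};{c_lat:g}'


def grid_wkt_lines(indices, start_long, start_lat, offset=0.5):
    """Header line plus one record per (x, y) index pair."""
    lines = ["POLYGON;XY;X;Y;Longitude;Latitude"]
    lines += [grid_record(x, y, start_long, start_lat, offset)
              for x, y in indices]
    return lines


if __name__ == "__main__":
    for line in grid_wkt_lines([(1, 1), (2, 1)], 66.75, 6.75):
        print(line)
```

Quoting the polygon field keeps its commas from being mistaken for field
separators when the file is later read as semicolon-delimited text.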





The metadata file, viz. IMD_Grid.wkt, was used for generating the shapefile
data. Quantum GIS (QGIS) was used for generating the spatial layer from the
WKT format. The following procedure has been developed and implemented for
importing the WKT file as a spatial layer in QGIS. The WKT file is referenced
to the WGS 84 coordinate reference system:

Open QGIS
Goto Layer
Goto Add Layer
Goto Add Delimited Text Layer
Browse WKT file
Set File Format as Custom Delimiters
Set delimiters as
    Semicolon
    Quote "
    Escape "
Record Options
    Number of header lines to discard 0
    First record has field names checked
Geometry definition
    Well known text (WKT) checked
    Geometry field POLYGON
Press OK
Coordinate Reference System Selector
    Coordinate Reference System WGS 84
Press OK

Moreover, the WKT file can also be converted to the shapefile format using the
export tool of QGIS. Shapefile is one of the most commonly used spatial data
formats. The following procedure was developed and implemented for converting
the spatial layer generated by the aforementioned procedure to the shapefile
format:

In Layer Panel
Right Click on Layer
Save As
In Window Save vector layer as...
    Format ESRI Shapefile
    Save as set File_Name
    CRS EPSG:4326, WGS 84
Press OK

Moreover, if the user does not have software that supports the WKT file
format, the user can use the shapefile for processing and subsequent analysis.
For example, the GIS software DIVA-GIS does not support the WKT file format.
Hence, a user of DIVA-GIS can use the shapefile format for visual analysis.
6.4 Pre-Processing of GCM Data-sets (Predictors)

6.4.1 GCM Data-Set (CanESM2)

Besides the precipitation data-set from IMD, a GCM data-set is required for
pre-processing. As discussed in the earlier section, CanESM2 has been used as
the GCM for validating the EMSDM. CanESM2 stands for the second generation
Canadian Earth System Model. This climate model was developed by the Canadian
Centre for Climate Modelling and Analysis (CCCma) of Environment Canada.
CanESM2 is the 4th generation coupled global climate model. CCCma contributed
this climate model to the IPCC 5th Assessment Report (AR5).

6.4.2 Preprocessing of CanESM2 

CanESM2 is divided into 128 x 64 grid cells covering the global domain. The
spatial resolution of each grid is approximately 2.7906° x 2.8125°: it is
uniform along longitude (2.8125°), while along latitude it is approximately
2.7906°. The data of each grid is stored in a folder named BOX_iiiX_jjY, where
iii and jj represent the longitudinal and latitudinal indices. The index
details, i.e. the grid definition with the longitude and latitude of the
centre of each grid, were obtained from the Canadian Centre for Climate
Modelling and Analysis (CCCma) website [CCCma, 2018].

6.4.2.1 Generation of Spatial Metadata for GCM Grids 

GCM_CanESM2_X_Longitude.csv and GCM_CanESM2_Y_Latitude.csv, given in Tables
6.3 and 6.4, provide the grid index information of the GCM boxes. These two
files are used to create the spatial metadata file GCM_CanESM2_Polygon.wkt.
Using the algorithm given in Pseudo Code 8, the user can generate this spatial
metadata file from the longitude index metadata file and the latitude index
metadata file. The file obtained by this algorithm comprises the grids
covering the entire Earth.
Table 6.3: Index of X and its Corresponding Longitude

X(iii)  Longitude   X(iii)  Longitude   X(iii)  Longitude   X(iii)  Longitude
001     000.0000    033     090.0000    065     180.0000    097     270.0000
002     002.8125    034     092.8125    066     182.8125    098     272.8125
003     005.6250    035     095.6250    067     185.6250    099     275.6250
004     008.4375    036     098.4375    068     188.4375    100     278.4375
005     011.2500    037     101.2500    069     191.2500    101     281.2500
006     014.0625    038     104.0625    070     194.0625    102     284.0625
007     016.8750    039     106.8750    071     196.8750    103     286.8750
008     019.6875    040     109.6875    072     199.6875    104     289.6875
009     022.5000    041     112.5000    073     202.5000    105     292.5000
010     025.3125    042     115.3125    074     205.3125    106     295.3125
011     028.1250    043     118.1250    075     208.1250    107     298.1250
012     030.9375    044     120.9375    076     210.9375    108     300.9375
013     033.7500    045     123.7500    077     213.7500    109     303.7500
014     036.5625    046     126.5625    078     216.5625    110     306.5625
015     039.3750    047     129.3750    079     219.3750    111     309.3750
016     042.1875    048     132.1875    080     222.1875    112     312.1875
017     045.0000    049     135.0000    081     225.0000    113     315.0000
018     047.8125    050     137.8125    082     227.8125    114     317.8125
019     050.6250    051     140.6250    083     230.6250    115     320.6250
020     053.4375    052     143.4375    084     233.4375    116     323.4375
021     056.2500    053     146.2500    085     236.2500    117     326.2500
022     059.0625    054     149.0625    086     239.0625    118     329.0625
023     061.8750    055     151.8750    087     241.8750    119     331.8750
024     064.6875    056     154.6875    088     244.6875    120     334.6875
025     067.5000    057     157.5000    089     247.5000    121     337.5000
026     070.3125    058     160.3125    090     250.3125    122     340.3125
027     073.1250    059     163.1250    091     253.1250    123     343.1250
028     075.9375    060     165.9375    092     255.9375    124     345.9375
029     078.7500    061     168.7500    093     258.7500    125     348.7500
030     081.5625    062     171.5625    094     261.5625    126     351.5625
031     084.3750    063     174.3750    095     264.3750    127     354.3750
032     087.1875    064     177.1875    096     267.1875    128     357.1875
Table 6.4: Index of Y and its corresponding Latitude

Y(jj)  Latitude   Y(jj)  Latitude   Y(jj)  Latitude   Y(jj)  Latitude
01     -87.863    17     -43.254    33     01.395     49     46.044
02     -85.096    18     -40.463    34     04.185     50     48.835
03     -82.312    19     -37.673    35     06.976     51     51.625
04     -79.525    20     -34.882    36     09.767     52     54.416
05     -76.736    21     -32.091    37     12.557     53     57.206
06     -73.947    22     -29.301    38     15.348     54     59.997
07     -71.157    23     -26.510    39     18.138     55     62.787
08     -68.367    24     -23.720    40     20.929     56     65.577
09     -65.577    25     -20.929    41     23.720     57     68.367
10     -62.787    26     -18.138    42     26.510     58     71.157
11     -59.997    27     -15.348    43     29.301     59     73.947
12     -57.206    28     -12.557    44     32.091     60     76.736
13     -54.416    29     -09.767    45     34.882     61     79.525
14     -51.625    30     -06.976    46     37.673     62     82.312
15     -48.835    31     -04.185    47     40.463     63     85.096
16     -46.044    32     -01.395    48     43.254     64     87.863
Pseudo Code 8. To create spatial metadata for GCM grids.

Open GCM_CanESM2_X_Longitude.txt as meta_x
Open GCM_CanESM2_Y_Latitude.txt as meta_y
Create GCM_CanESM2_Polygon.wkt
Open GCM_CanESM2_Polygon.wkt as s_meta
Write("POLYGON;XY;X;Y;Longitude;Latitude") in s_meta
Read(X, Min_Long) from meta_x
while Read(X, Max_Long) from meta_x till EOF do
    C_Long = Min_Long + 0.5 * (Max_Long - Min_Long)
    Rewind meta_y
    Read(Y, Min_Lat) from meta_y
    while Read(Y, Max_Lat) from meta_y till EOF do
        C_Lat = Min_Lat + 0.5 * (Max_Lat - Min_Lat)
        Write("\n\"POLYGON((") in s_meta
        Write(Min_Long, Min_Lat, ) in s_meta
        Write(Max_Long, Min_Lat, ) in s_meta
        Write(Max_Long, Max_Lat, ) in s_meta
        Write(Min_Long, Max_Lat, ) in s_meta
        Write(Min_Long, Min_Lat, "))\";") in s_meta
        Write(X"X"Y"Y;") in s_meta
        Write(X, Y, ) in s_meta
        Write(C_Long, C_Lat, ) in s_meta
        Min_Lat = Max_Lat
    end
    Min_Long = Max_Long
end
Save s_meta
Close s_meta
Close meta_y
Close meta_x





6.4.2.2 Generation of Metadata for GCM Grids of AOI

The algorithm of Pseudo Code 8 produces grids for the whole world. For EMSDM,
the CanESM2 grids that cover India are required; these are obtained by spatial
overlay analysis of the IMD_Grid.wkt and GCM_CanESM2_Polygon.wkt data-sets.
This spatial overlay analysis is carried out in the QGIS environment. The
procedure, which is automated in the EMSDM and used for extracting the
relevant CanESM2 grids, is given below:

Open QGIS
Goto Layer
Goto Add Layer
Goto Add Delimited Text Layer
Browse GCM_CanESM2_Polygon.wkt file
Press OK
Coordinate Reference System Selector
    Coordinate Reference System WGS 84
Press OK
Goto Layer
Goto Add Layer
Goto Add Delimited Text Layer
Browse IMD_Grid.wkt file
Press OK
Coordinate Reference System Selector
    Coordinate Reference System WGS 84
Press OK
Goto Vector
Goto Research Tools
Goto Select by location
    Set Layer to Select From: GCM_CanESM2_Polygon
    Set Additional layer: IMD_Grid
    Geometric Predicate: Overlaps
    Modify current Selection by: Creating new Selection
Press RUN
In Layer Panel
Select GCM_CanESM2_Polygon
Right Click on Layer
Save As
In Window Save vector layer as...
    Format: Comma Separated Value [CSV]
    Save as: set File_Name
    CRS: EPSG:4326, WGS 84
    Encoding:
    Save Only Selected Features
Press OK

Using the procedure discussed above, the relevant CanESM2 grids that
correspond to India have been obtained. The data of these grids was stored as
GCM_CanESM2_Polygon_India.csv. This file comprises the information on the
grids that are required for downscaling over India.

6.4.2.3 Extraction of Spatial Metadata for GCM Grids of AOI

The metadata created in section 6.4.2.2 is used to filter the required grids
out of GCM_CanESM2_Polygon.wkt in order to generate the spatial metadata of
the CanESM2 grids of the AOI. The algorithm presented in Pseudo Code 9
segregates the spatial metadata of the AOI from the spatial metadata generated
in section 6.4.2.1. Its inputs are the spatial metadata for all GCM grids
(section 6.4.2.1) and the metadata for the GCM grids of the AOI (section
6.4.2.2).
Pseudo Code 9. To extract spatial metadata of GCM grids for AOI.

Set Run_It → TRUE
Open GCM_Name_Polygon.wkt as s_meta
Open GCM_Name_Polygon_aoi.csv as meta
Create GCM_Name_Polygon_aoi.wkt
Open GCM_Name_Polygon_aoi.wkt as s_meta_aoi
Read( "POLYGON;XY;X;Y;Longitude;Latitude" ) from s_meta
Read( "XY" ) from meta
Write( "POLYGON;XY;X;Y;Longitude;Latitude" ) in s_meta_aoi
while Read( X "X" Y "Y" ) from meta till EOF do
    while Run_It do
        Read( "\n\"POLYGON((" ) from s_meta
        Read( Min_Long, Min_Lat, ) from s_meta
        Read( Max_Long, Min_Lat, ) from s_meta
        Read( Max_Long, Max_Lat, ) from s_meta
        Read( Min_Long, Max_Lat, ) from s_meta
        Read( Min_Long, Min_Lat, ) from s_meta
        Read( x "X" y "Y;" ) from s_meta
        Read( x, y, ) from s_meta
        Run_It = Read( C_Long, C_Lat, ) from s_meta
        if x == X && y == Y then
            Write( "\n\"POLYGON((" ) in s_meta_aoi
            Write( Min_Long, Min_Lat, ) in s_meta_aoi
            Write( Max_Long, Min_Lat, ) in s_meta_aoi
            Write( Max_Long, Max_Lat, ) in s_meta_aoi
            Write( Min_Long, Max_Lat, ) in s_meta_aoi
            Write( Min_Long, Min_Lat, "))\";" ) in s_meta_aoi
            Write( x "X" y "Y;" ) in s_meta_aoi
            Write( x, y, ) in s_meta_aoi
            Write( C_Long, C_Lat, ) in s_meta_aoi
            Break
        end
    end
end
Save s_meta_aoi
Close s_meta_aoi
Close s_meta
Close meta
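
The matching step of Pseudo Code 9 can be illustrated with a small C sketch.
It is not the EMSDM source: the function name row_in_aoi and the in-memory
key list are assumptions made for illustration, and the row layout (WKT
polygon first, XY key second, fields separated by semicolons) is taken from
the header line used in the pseudo code.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the XY key of a spatial-metadata row (the second
 * semicolon-separated field) appears in the AOI key list, else 0.
 * row_in_aoi and aoi_keys are hypothetical names for this sketch. */
int row_in_aoi(const char *row, const char *const aoi_keys[], size_t n_keys)
{
    const char *key = strchr(row, ';');    /* end of the WKT polygon field */
    if (key == NULL)
        return 0;
    key++;                                  /* first character of the XY field */
    size_t key_len = strcspn(key, ";\r\n"); /* length up to the next separator */

    for (size_t i = 0; i < n_keys; i++) {
        if (strlen(aoi_keys[i]) == key_len &&
            strncmp(aoi_keys[i], key, key_len) == 0)
            return 1;
    }
    return 0;
}
```

A caller would read GCM_Name_Polygon.wkt line by line and copy to the AOI
file only those rows for which row_in_aoi returns 1. Unlike the nested loops
of the pseudo code, this form does not depend on both files being sorted in
the same order.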





6.4.2.4 Extraction of GCM Grid Data-Sets for AOI 

When downscaling is performed for an AOI of large spatial extent, several GCM
grids are associated with the AOI. Obtaining the data-sets of these GCM grids
from the official data site therefore needs to be automated. Pseudo Code 10
creates a URL file, which EMSDM uses to automatically download the required
GCM parameter data, in the form of zip files, from the official data site.

Pseudo Code 10. To create a URL file to obtain GCM grid data-sets.

Set Part_URL
Set Full_URL
Open GCM_CanESM2_Polygon_aoi.csv as meta
Create GCM_CanESM2_Polygon_aoi.url
Open GCM_CanESM2_Polygon_aoi.url as data_urls
Read( "XY" ) from meta
while Read( x "X" y "Y" ) from meta till EOF do
    Full_URL = ToString( Part_URL, x, "X_", y, "Y.zip" )
    Write( Full_URL ) in data_urls
end
Save data_urls
Close data_urls
Close meta
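
The URL construction in Pseudo Code 10 could look roughly as follows in C.
This is a sketch only: build_grid_url is a hypothetical name, the base
address passed as part_url is a placeholder rather than the real data-site
address, and the index is kept as text so that leading zeros survive.

```c
#include <stdio.h>

/* Builds Part_URL + x + "X_" + y + "Y.zip" from an index such as "066X38Y".
 * Returns 0 on success, -1 if the index is malformed or the buffer is too
 * small. build_grid_url is a hypothetical helper, not part of EMSDM. */
int build_grid_url(const char *part_url, const char *grid_index,
                   char *out, size_t out_size)
{
    char x[8], y[8];
    int consumed = -1;

    /* %n is only reached when the trailing 'Y' literal has matched. */
    if (sscanf(grid_index, "%7[0-9]X%7[0-9]Y%n", x, y, &consumed) != 2 ||
        consumed < 0)
        return -1;

    int n = snprintf(out, out_size, "%s%sX_%sY.zip", part_url, x, y);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling the function once per line of the AOI index file and writing each
result to the .url file reproduces the loop of the pseudo code.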
Pseudo Code 11. To find the GCM grid corresponding to a predictand grid.

Set Run_It → TRUE
Set DATA_SOURCE → "IMD"
Set AOI_NAME → "India"
Set GCM_NAME → "CanESM2"
Set GRID_NAME → "001X001Y"
Set GRID_START_LONG → 66.75
Set GRID_START_LAT → 6.75
Set OFFSET → 0.5
Set C_OFFSET → 0.25
String WKT_File
Write( "..\\Text_Files\\GCM_" GCM_NAME AOI_NAME ".csv" ) in WKT_File
Open WKT_File as meta
Read( Grid_X "X" Grid_Y "Y" ) from GRID_NAME
Grid_Long = GRID_START_LONG + 0.5 * ( Grid_X - 1 )
Grid_Lat = GRID_START_LAT + 0.5 * ( Grid_Y - 1 )
Read( "POLYGON;XY;X;Y;Longitude;Latitude" ) from meta
while Run_It do
    Read( "\n\"POLYGON((" ) from meta
    Read( Min_Long, Min_Lat, ) from meta
    Read( Max_Long, Min_Lat, ) from meta
    Read( Max_Long, Max_Lat, ) from meta
    Read( Min_Long, Max_Lat, ) from meta
    Read( Min_Long, Min_Lat, ) from meta
    Read( x "X" y "Y;" ) from meta
    Read( x, y, ) from meta
    Run_It = Read( C_Long, C_Lat, ) from meta
    if Grid_Long > Min_Long then
        if Grid_Long < Max_Long then
            if Grid_Lat > Min_Lat then
                if Grid_Lat < Max_Lat then
                    Write( "BOX_" x "X_" y "Y" ) in GCM_Grid_Name
                    Break
                end
            end
        end
    end
end
Close meta
Print GCM_Grid_Name





Pseudo Code 12. To Generate Monthly GCM Grid Data-Set.

String Temporal_Type, GCM_Dir, Box_Name, File_Name
String File_Read_Dir, File_Write_Dir
String File_Read_Path, File_Write_Path
Set Source_Dir → "daily"
Set Target_Dir[ ] → { "monthly", "annual", "decadal" }
Set GCM_Dir → Locate-GCM-Dir()
Set GCM_SubDir[ ] → Get-GCM-SubDir()
Temporal_Type = Get-Entry( "Enter Required Temporal Type" )
if Temporal_Type == "monthly" then n_tt = 1
if Temporal_Type == "annual" then n_tt = 2
if Temporal_Type == "decadal" then n_tt = 3
OpenDir GCM_Dir as gcm_dir
while ( Box_Name = ReadDir( gcm_dir ) ) != NULL do
    for g_i = 0 to n_GCM_SubDir do
        Write( "..\\data\\GCM\\" Source_Dir "\\" GCM_Name "\\" Box_Name "\\"
               GCM_SubDir[g_i] ) in File_Read_Dir
        Write( "..\\data\\GCM\\" Target_Dir[n_tt] "\\" GCM_Name "\\" Box_Name
               "\\" GCM_SubDir[g_i] ) in File_Write_Dir
        OpenDir File_Read_Dir as file_dir
        while ( File_Name = ReadDir( file_dir ) ) != NULL do
            Write( File_Read_Dir "\\" File_Name ) in File_Read_Path
            Write( File_Write_Dir "\\" File_Name ) in File_Write_Path
            Open File_Read_Path as read_file
            Create File_Write_Path
            Open File_Write_Path as write_file
            if Temporal_Type == "monthly" then
                ConvertToMonth( read_file, write_file )
            end
            if Temporal_Type == "annual" then
                ConvertToAnnual( read_file, write_file )
            end
            if Temporal_Type == "decadal" then
                ConvertToDecadal( read_file, write_file )
            end
            Save write_file
            Close write_file
            Close read_file
        end
        CloseDir file_dir
    end
end
CloseDir gcm_dir





Pseudo Code 13. To evaluate Pearson's correlation coefficient.

String Correlation_Type
String Data_Source
String Climate_Variable
String AOI_Name
String GCM_Name
String Temporal_Type
String Input_File
String Grid_Name
String GCM_Grid_Dir
String Station_Data_Path
String Correlation_Results_Data_Path
Set gcm_grid_found → FALSE
Data_Source = Get-Entry( "Enter Name of Data Source" )
AOI_Name = Get-Entry( "Enter Name of AOI" )
GCM_Name = Get-Entry( "Enter Name of GCM" )
Climate_Variable = Get-Entry( "Enter Name of Climate Variable" )
Correlation_Type = Get-Entry( "Enter Type of Correlation Type Required" )
if Correlation_Type == "parametric" then
    n_ctype = 1
end
if Correlation_Type == "non-parametric" then
    n_ctype = 2
end
Temporal_Type = Get-Entry( "Enter Required Temporal Type" )
Write( "..\\Text_Files\\" Data_Source "_" AOI_Name "_Index.csv" ) in Input_File
Open Input_File as meta_xy
Read( "XY" ) from meta_xy
while Read( Grid_Name ) from meta_xy till EOF do
    gcm_grid_found = Find-GCM-Grid( Data_Source, AOI_Name, GCM_Name, Grid_Name )
    if gcm_grid_found then
        if n_ctype == 1 then
            Write( "..\\data\\" Data_Source "\\" Climate_Variable "\\" AOI_Name "\\"
                   Temporal_Type "\\" Grid_Name ".csv" ) in Station_Data_Path
            Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                   AOI_Name "\\" Temporal_Type "\\correlation\\parametric\\"
                   Grid_Name ".csv" ) in Correlation_Results_Data_Path
            Write( "..\\data\\GCM\\" Temporal_Type "\\" GCM_Name "\\" _GCM_GRID_ )
                   in GCM_Grid_Dir
            Correlation-Pearson( Station_Data_Path, Correlation_Results_Data_Path,
                   GCM_Grid_Dir )
        end
        if n_ctype == 2 then
            Write( "..\\data\\nonparametric\\" Data_Source "\\" Climate_Variable "\\"
                   AOI_Name "\\" Temporal_Type "\\" Grid_Name ".csv" )
                   in Station_Data_Path
            Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                   AOI_Name "\\" Temporal_Type "\\correlation\\nonparametric\\"
                   Grid_Name ".csv" ) in Correlation_Results_Data_Path
            CreateRankIndex( GCM_Name, _GCM_GRID_ )
            Write( "..\\data" Corr_Type "\\GCM\\" Temporal_Type "\\" GCM_Name "\\"
                   _GCM_GRID_ ) in GCM_Grid_Dir
            Correlation-Spearman( Station_Data_Path, Correlation_Results_Data_Path,
                   GCM_Grid_Dir )
        end
    end
end
Close meta_xy





Pseudo Code 14. To generate downscaling models using MLR and MvLR.

1  String Regression_Type
2  String Data_Source
3  String Temporal_Type
4  String Climate_Variable
5  String AOI_Name, GCM_Name, Grid_Name
6  String Index_File, Input_File, Output_File
7  Set gcm_grid_found → FALSE
8  Set Sum → 0
9  Set Count → 0
10 Data_Source = Get-Entry( "Enter Name of Data Source" )
11 AOI_Name = Get-Entry( "Enter Name of AOI" )
12 GCM_Name = Get-Entry( "Enter Name of GCM" )
13 Climate_Variable = Get-Entry( "Enter Name of Climate Variable" )
14 Regression_Type = Get-Entry( "Enter Type of Regression Required" )
15 if Regression_Type == "MLR" then
16     n_rtype = 1
17 end
18 if Regression_Type == "MvLR" then
19     n_rtype = 2
20 end
21 Temporal_Type = Get-Entry( "Enter Required Temporal Type" )
22 Write( "..\\Text_Files\\" Data_Source "_" AOI_Name "_Index.csv" ) in Index_File
23 Open Index_File as meta_xy
24 Read( "XY" ) from meta_xy
25 while Read( Grid_Name ) from meta_xy till EOF do
26     gcm_grid_found = Find-GCM-Grid( Data_Source, AOI_Name, GCM_Name, Grid_Name )
27     if gcm_grid_found then
28         if n_rtype == 1 then
29             Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                      AOI_Name "\\" Temporal_Type "\\correlation\\parametric\\"
                      Grid_Name ".csv" ) in Input_File
30             Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                      AOI_Name "\\" Temporal_Type "\\regression\\MLR\\"
                      Grid_Name ".csv" ) in Output_File
31             if Regression-MLR( Input_File, Output_File ) then
32                 Count = Count + 1
33             end
34             Sum = Sum + 1
35         end
36         if n_rtype == 2 then
37             n_Group = Get-Entry( "Enter Number of Regression Group" )
38             Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                      AOI_Name "\\" Temporal_Type "\\correlation\\parametric\\"
                      Grid_Name ".csv" ) in Input_File
39             Write( "..\\Output\\" GCM_Name "\\" Data_Source "\\" Climate_Variable "\\"
                      AOI_Name "\\" Temporal_Type "\\regression\\MvLR\\"
                      Grid_Name ".csv" ) in Output_File
40             if Regression-MvLR( Input_File, Output_File, n_Group ) then
41                 Count = Count + 1
42             end
43             Sum = Sum + 1
44         end
45     end
46 end
47 Print (Count/Sum) × 100 % Success Rate
48 Close meta_xy






6.5 Spatial Analysis 


Spatial analysis has been carried out to spatially link each local grid with
its GCM grid. The algorithm for spatial analysis is presented in Pseudo
Code 11. Its inputs are the local grid data source, the AOI name, the GCM
name and the grid id. The algorithm uses these inputs to carry out a spatial
overlay analysis that maps the local station grid to the GCM grid
containing it.
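
The two computations in Pseudo Code 11, converting a local grid name to
coordinates and testing which GCM box contains that point, can be sketched in
C as below. The starting coordinates (66.75, 6.75) and the 0.5 degree offset
come from the pseudo code; the GridBox type and the function names are
assumptions made for this illustration, and the boxes are held in memory
rather than read from the metadata file.

```c
#include <stdio.h>

/* Bounding box of one GCM grid cell (hypothetical type for this sketch). */
typedef struct { double min_long, min_lat, max_long, max_lat; } GridBox;

/* Converts a local grid name such as "022X053Y" to longitude and latitude.
 * Returns 0 on success, -1 if the name is malformed. */
int local_grid_position(const char *grid_name, double start_long,
                        double start_lat, double offset,
                        double *lon, double *lat)
{
    int gx, gy, consumed = -1;

    /* %n is only reached when the trailing 'Y' literal has matched. */
    if (sscanf(grid_name, "%dX%dY%n", &gx, &gy, &consumed) != 2 || consumed < 0)
        return -1;
    *lon = start_long + offset * (gx - 1);
    *lat = start_lat + offset * (gy - 1);
    return 0;
}

/* Returns the position of the first box that strictly contains the point
 * (the same strict inequalities as Pseudo Code 11), or -1 if none does. */
int find_gcm_grid(const GridBox *boxes, int n_boxes, double lon, double lat)
{
    for (int i = 0; i < n_boxes; i++) {
        if (lon > boxes[i].min_long && lon < boxes[i].max_long &&
            lat > boxes[i].min_lat && lat < boxes[i].max_lat)
            return i;
    }
    return -1;
}
```

For the grid 022X053Y this gives 66.75 + 0.5 × 21 = 77.25 and
6.75 + 0.5 × 52 = 32.75. Because the inequalities are strict, a point lying
exactly on a box edge matches no box; whether that case needs special
handling depends on how the local and GCM grids are aligned.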

6.6 Temporal Analysis 

Temporal analysis has been carried out to temporally map the local grid
data-set to its corresponding GCM grid data-set, which is obtained from the
spatial analysis performed in the previous section. The algorithm for
temporal analysis is presented in Pseudo Code 12.

In addition, to reduce the time complexity of further processing, temporal
transformation of the local grid data-set as well as the GCM grid data-set
has also been carried out using Pseudo Code 12.
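
Pseudo Code 12 delegates the actual aggregation to ConvertToMonth, whose
internals are not listed here. The following C sketch shows one plausible
form under two stated assumptions: the daily records are already in memory
in chronological order, and the monthly value is the mean of the daily
values (a monthly total would replace the division by a plain sum). The
record types and the function name are hypothetical.

```c
#include <stddef.h>

/* One daily value and one monthly aggregate (hypothetical record types). */
typedef struct { int year; int month; double value; } DailyRecord;
typedef struct { int year; int month; double mean; } MonthlyRecord;

/* Averages runs of consecutive daily records that share (year, month).
 * Assumes chronological input. Returns the number of monthly records
 * written, which never exceeds out_cap. */
size_t convert_to_month(const DailyRecord *daily, size_t n,
                        MonthlyRecord *out, size_t out_cap)
{
    size_t n_out = 0, i = 0;

    while (i < n && n_out < out_cap) {
        int year = daily[i].year, month = daily[i].month;
        double sum = 0.0;
        size_t count = 0;

        /* Consume every record that belongs to the current month. */
        while (i < n && daily[i].year == year && daily[i].month == month) {
            sum += daily[i].value;
            count++;
            i++;
        }
        out[n_out].year = year;
        out[n_out].month = month;
        out[n_out].mean = sum / (double)count;  /* count >= 1 here */
        n_out++;
    }
    return n_out;
}
```

The annual and decadal conversions would follow the same pattern with a
different grouping key.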

6.7 Screening Predictors 

As discussed in section 5.2.4, this step selects the set of dominant
predictors for the downscaling process using the algorithm presented in
Pseudo Code 13.

In Pseudo Code 13, screening of predictors is carried out using parametric
and non-parametric statistics. As discussed in chapter 4, Pearson's
correlation is used as the parametric statistic and Spearman's correlation
as the non-parametric statistic. Pseudo Code 13 internally uses the GCM
output data-set of the reference grids of the specified AOI to compute these
statistics, and applies the criteria discussed in section 5.2.4 to screen
out the GCM parameters whose correlation is less than the user-defined
threshold. Moreover, the user can provide a weightage file for the
predictors in order to carry out downscaling with a specified set of
predictors and to analyse the effect of these predictors on climate change.
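
The threshold test on Pearson's correlation can be sketched in C as follows.
The function name is hypothetical, and comparing the absolute value of r
with the threshold is an assumption: the criteria of section 5.2.4 are not
repeated here and may treat negative correlations differently. The
threshold is assumed to be non-negative.

```c
#include <stddef.h>

/* Returns 1 if |r| >= threshold for Pearson's r between the two series,
 * else 0. Constant series (r undefined) are screened out.
 * passes_screening is a hypothetical name for this sketch. */
int passes_screening(const double *predictor, const double *predictand,
                     size_t n, double threshold)
{
    if (n < 2)
        return 0;

    double mean_x = 0.0, mean_y = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_x += predictor[i];
        mean_y += predictand[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;

    /* Centred sums of squares and cross-products. */
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dx = predictor[i] - mean_x;
        double dy = predictand[i] - mean_y;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    if (sxx <= 0.0 || syy <= 0.0)
        return 0;

    /* r = sxy / sqrt(sxx * syy), so |r| >= t is equivalent to
     * sxy^2 >= t^2 * sxx * syy; this avoids taking a square root. */
    return sxy * sxy >= threshold * threshold * sxx * syy;
}
```

Applying this test to each of the GCM parameters of a grid, against the
local predictand series, leaves the predictors that pass on to model
generation.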

6.8 Model Generation 

The mathematical model discussed in section 5.2.4 is implemented as an
algorithm for developing the statistical downscaling model. This algorithm
is presented in Pseudo Code 14. The algorithm automates model development
for all the grids of the AOI. The inputs required for developing the model
are specified in lines 1-6 of the pseudo code.
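
Pseudo Code 14 calls Regression-MLR for each grid without listing it. As an
illustration of what a multiple linear regression fit involves, the C sketch
below solves the ordinary least squares normal equations by Gaussian
elimination. It is a generic textbook method, not the EMSDM routine; the
function name, the MAX_TERMS limit and the in-memory row-major layout are
assumptions.

```c
#include <stddef.h>

#define MAX_TERMS 8  /* intercept plus up to 7 predictors; arbitrary limit */

static double absd(double v) { return v < 0.0 ? -v : v; }

/* Fits y ~ b[0] + b[1]*x1 + ... + b[p]*xp by ordinary least squares.
 * x is row-major with n rows and p columns; b must hold p + 1 values.
 * Returns 0 on success, -1 if the problem is too large, has too few
 * observations, or is numerically singular. */
int fit_mlr(const double *x, const double *y, size_t n, size_t p, double *b)
{
    double a[MAX_TERMS][MAX_TERMS + 1] = {{0}};
    size_t m = p + 1;

    if (m > MAX_TERMS || n < m)
        return -1;

    /* Accumulate the augmented normal equations (X'X | X'y). */
    for (size_t i = 0; i < n; i++) {
        for (size_t r = 0; r < m; r++) {
            double xr = (r == 0) ? 1.0 : x[i * p + (r - 1)];
            for (size_t c = 0; c < m; c++) {
                double xc = (c == 0) ? 1.0 : x[i * p + (c - 1)];
                a[r][c] += xr * xc;
            }
            a[r][m] += xr * y[i];
        }
    }

    /* Forward elimination with partial pivoting. */
    for (size_t k = 0; k < m; k++) {
        size_t piv = k;
        for (size_t r = k + 1; r < m; r++)
            if (absd(a[r][k]) > absd(a[piv][k]))
                piv = r;
        if (absd(a[piv][k]) < 1e-12)
            return -1;
        for (size_t c = 0; c <= m && piv != k; c++) {
            double t = a[k][c];
            a[k][c] = a[piv][c];
            a[piv][c] = t;
        }
        for (size_t r = k + 1; r < m; r++) {
            double f = a[r][k] / a[k][k];
            for (size_t c = k; c <= m; c++)
                a[r][c] -= f * a[k][c];
        }
    }

    /* Back substitution. */
    for (size_t k = m; k-- > 0; ) {
        double s = a[k][m];
        for (size_t c = k + 1; c < m; c++)
            s -= a[k][c] * b[c];
        b[k] = s / a[k][k];
    }
    return 0;
}
```

A non-zero return value corresponds to the failure branch of the pseudo
code, in which Sum is incremented but Count is not.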

6.9 Concluding Remarks 

In this chapter, the algorithms and procedures required for implementing
EMSDM have been discussed in detail. The algorithms and procedures of EMSDM
are programmed in the C language. Hence, EMSDM can be deployed irrespective
of the operating system (Windows, Unix, Linux, macOS, etc.) and has the
advantage of interoperability over the widely used Windows-based SDSM
software. Moreover, in addition to carrying out statistical downscaling,
EMSDM generates interoperable spatial data for the corresponding GCM and
local grids. These spatial data can be used for different climate studies.
Following this discussion of the implementation of EMSDM, Chapter 7
demonstrates the applicability of EMSDM using CanESM2 as the GCM data-set,
IMD as the local data-set and India as the selected AOI.
Chapter 7

Application of EMSDM 

7.1 General 

In this chapter, the application of EMSDM is presented to demonstrate its
applicability to statistical downscaling for a given area of interest.

The local precipitation data-set of India, acquired from the India
Meteorological Department (IMD) and comprising 1241 spatial grids, and the
corresponding GCM data-set comprising 60 spatial grids have been used for
the generation of results. The IMD data-set is available as files named in
the format rixx_yyyy.grd, where xx denotes the resolution of the grid in
degrees and yyyy denotes the year of the precipitation observations. As the
primary climate data, the precipitation data acquired from IMD are
structured as grids with a resolution of 0.5° x 0.5°. The temporal
resolution of the IMD data-set is one day, and the precipitation data-set
covers the years 1971 to 2005.

The GCM data-set, CanESM2, acquired from CCCma, is also structured as
grids, with a grid resolution of 2.7906° x 2.8125°. The temporal resolution
of the predictor data-set obtained from this GCM is one day. The CanESM2
data-set covers the years 1961 to 2005 for the historical and NCEP/NCAR
reanalysis data, and the years 2006 to 2100 for the scenarios RCP26, RCP45
and RCP85.
7.2 Pre-Processing of Local Data-sets (Predictand)


Local data are preprocessed using the algorithms presented in section 6.3 of
Chapter 6. The spatial indices of the local stations are encoded in the
proper format. As shown in Figure 7.1, a file comprising the spatial indices
of the local stations has been generated using the algorithm presented in
section 6.3. The files comprising these spatial indices are generated
programmatically in CSV format and stored in a programmatically generated
directory, as depicted in Figure 7.2.

As shown in Figure 7.3, the spatial metadata for the IMD grids are generated
in WKT format using the algorithms presented in section 6.3 and the indices
depicted in Figure 7.1.
Figure 7.1: IMD-Index

Figure 7.2: IMD-Grids

Figure 7.3: IMD Spatial Metadata

The geo-visualization of the spatial metadata for the IMD grids in QGIS
software is depicted in Figure 7.4, Figure 7.5 and Figure 7.6.

Figure 7.4: IMD Spatial Metadata in QGIS

Figure 7.5: IMD Spatial Metadata Grids

Figure 7.6: IMD Spatial Metadata Grids Map With Zoom


7.3 Pre-Processing of CanESM2 Data-sets (Predictors)

7.3.1 Generation of Spatial Metadata of CanESM2 Grids 

The spatial metadata of the CanESM2 data-set are generated from the
longitude and latitude indices given in Table 6.3 and Table 6.4
respectively, and processed using the algorithms presented in section 6.4 of
Chapter 6.

The spatial indices of the CanESM2 spatial metadata are encoded in WKT
format using the algorithm presented in section 6.4 of Chapter 6, as shown
in Figure 7.7. The map-based visualization of the CanESM2 spatial metadata
is shown in Figure 7.8.


[Screenshot of GCM_CanESM2.wkt opened in a text editor. The file has the
semicolon-separated header WKT;XY;X;Y;Longitude;Latitude and one row per
grid cell, holding the cell's WKT polygon, its index and its centre
coordinates.]

Figure 7.7: CanESM2 Spatial Metadata in WKT

Figure 7.8: CanESM2 Grids Map
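
One row of the spatial metadata file in Figure 7.7 can be produced with a
single formatted write. The C sketch below is an illustration, not the EMSDM
source: the function name is hypothetical, and the field widths (ten
characters with five decimals for coordinates, three and two digits for the
X and Y indices) are inferred from the values visible in the figure. The
centre of the cell is taken as the midpoint of its bounds, which agrees with
the centre coordinates listed there.

```c
#include <stdio.h>

/* Writes one metadata row: quoted WKT polygon (closed ring, five vertices),
 * XY key, X, Y, centre longitude, centre latitude. Returns 0 on success,
 * -1 if the buffer is too small. format_wkt_row is a hypothetical name. */
int format_wkt_row(double min_long, double min_lat,
                   double max_long, double max_lat,
                   int x, int y, char *out, size_t out_size)
{
    double c_long = (min_long + max_long) / 2.0;
    double c_lat = (min_lat + max_lat) / 2.0;

    int n = snprintf(out, out_size,
        "\"POLYGON((%010.5f %010.5f, %010.5f %010.5f, %010.5f %010.5f, "
        "%010.5f %010.5f, %010.5f %010.5f))\";%03dX%02dY;%03d;%02d;"
        "%010.5f;%010.5f",
        min_long, min_lat, max_long, min_lat, max_long, max_lat,
        min_long, max_lat, min_long, min_lat,
        x, y, x, y, c_long, c_lat);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The first and last vertices are identical because a WKT polygon ring must be
closed. Files written this way can be loaded in QGIS as delimited text with
a WKT geometry column.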


7.3.2 Generation of Metadata of CanESM2 Grids for India 


The metadata of the CanESM2 grids for India, in the form of indices, are
generated through overlay analysis of the IMD and CanESM2 spatial grids
using the procedure presented in section 6.4.2.2. The map-based
visualization of the IMD and CanESM2 spatial grid layers before the overlay
analysis is shown in Figure 7.9. Figure 7.10 depicts the CanESM2 grids for
India overlaid on the IMD grids. Figure 7.11 depicts the CanESM2 grids for
India in gray colour. The results of the overlay analysis for CanESM2 are
stored in an index file in CSV format, as shown in Figure 7.12.


Figure 7.9: IMD and CanESM2 Grids Overlay Map

Figure 7.10: IMD Grids, CanESM2 Grids, and CanESM2 Grids for India
Overlay Map

Figure 7.11: CanESM2 Grids, and CanESM2 Grids for India Overlay Map

Figure 7.12: CanESM2 Indices for India

7.3.3 Extraction of Spatial Metadata for CanESM2 Grids of India

The spatial metadata of CanESM2 for India are generated using the algorithm
presented in section 6.4.2.3 and the index file generated in section
6.4.2.2; the result is shown in Figure 7.13. Figure 7.14 shows the grids of
the CanESM2 spatial metadata for India in map format, and Figure 7.15 shows
the grids with their indices. The IMD grid layer and the CanESM2 grid layer
for India are overlaid on each other in Figure 7.16 and Figure 7.17.
Figure 7.13: WKT Spatial Metadata of CanESM2 for India

Figure 7.14: CanESM2 Grids for India Map

Figure 7.15: CanESM2 Grids for India Map with Zoom

Figure 7.16: IMD Grids and CanESM2 Grids for India Overlay Map

Figure 7.17: IMD Grids and CanESM2 Grids for India Zoomed Overlay Map


7.4 Spatial Analysis 

The CanESM2 grid corresponding to each IMD grid is extracted using the
spatial analysis module, which implements the algorithm presented in Pseudo
Code 11 (section 6.5). The process is depicted in Figure 7.18. The module
takes four inputs (viz. data-source name, AOI name, GCM name and grid
index) and produces the CanESM2 grid index.



Figure 7.18: Spatial Analysis Result in Text Format

7.5 Temporal Analysis


Temporal transformation is applied to the IMD and CanESM2 data-sets to
obtain data-sets on a common temporal scale (daily, monthly, annual or
decadal). Moreover, the IMD and CanESM2 data-sets are mapped to each other
through the temporal analysis. Figure 7.19 depicts the programmatically
generated directory structure and the transformed data-sets.



Figure 7.19: Directory Structure after Temporal Analysis

7.6 Screening Predictors


The CanESM2 data-set comprises data pertaining to twenty-six parameters,
which are given in Table 7.1. Screening of predictors has been carried out
using the algorithm given in Pseudo Code 13 of section 6.7, so as to select
the relevant predictors for model generation.


Table 7.1: Index of Climate Variables in CanESM2 Data-set

S.No.  Parameter  Description
1      mslpgl     Mean Sea Level Pressure
2      p1_fgl     1000hPa Wind Speed
3      p1_ugl     1000hPa Zonal Wind Component
4      p1_vgl     1000hPa Meridional Wind Component
5      p1_zgl     1000hPa Relative Vorticity of Wind
6      p1thgl     1000hPa Wind Direction
7      p1zhgl     1000hPa Divergence of True Wind
8      p500gl     500hPa Geopotential
9      p5_fgl     500hPa Wind Speed
10     p5_ugl     500hPa Zonal Wind Component
11     p5_vgl     500hPa Meridional Wind Component
12     p5_zgl     500hPa Relative Vorticity of Wind
13     p5thgl     500hPa Wind Direction
14     p5zhgl     500hPa Divergence of True Wind
15     p850gl     850hPa Geopotential
16     p8_fgl     850hPa Wind Speed
17     p8_ugl     850hPa Zonal Wind Component
18     p8_vgl     850hPa Meridional Wind Component
19     p8_zgl     850hPa Relative Vorticity of Wind
20     p8thgl     850hPa Wind Direction
21     p8zhgl     850hPa Divergence of True Wind
22     prcpgl     Total Precipitation
23     s500gl     500hPa Specific Humidity
24     s850gl     850hPa Specific Humidity
25     shumgl     1000hPa Specific Humidity
26     tempgl     Air Temperature at 2m

Figure 7.20: Results of Correlation Analysis for IMD Grid 009X034Y

Figure 7.21: Results of Correlation Analysis for IMD Grid 010X030Y


Figures 7.20 and 7.21 depict the correlation results for IMD grids 009X034Y
and 010X030Y respectively, each computed against its spatially referenced
CanESM2 grid. The same process is applied to all IMD grids, and the
correlation coefficients are stored programmatically in their corresponding
files. Figures 7.22 and 7.23 depict the programming interface and the
execution time for computing the Pearson and Spearman-Rho statistics, which
are 1199 seconds and 624 seconds respectively.
Figure 7.22: Correlation Module Execution-Time with Pearson method for
Whole India

Figure 7.23: Correlation Module Execution-Time with Spearman-Rho method
for Whole India
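
Spearman's rho is Pearson's correlation applied to the ranks of the two
series, which is why Pseudo Code 13 calls CreateRankIndex before
Correlation-Spearman. The ranking step can be sketched in C as below. How
EMSDM handles tied values is not stated, so the use of average ranks for
ties is an assumption, as is the function name; the quadratic loop is chosen
for brevity rather than speed.

```c
#include <stddef.h>

/* Assigns 1-based ranks to values[0..n-1], giving tied values the mean of
 * the ranks they jointly occupy. average_ranks is a hypothetical name. */
void average_ranks(const double *values, size_t n, double *ranks)
{
    for (size_t i = 0; i < n; i++) {
        size_t smaller = 0, equal = 0;

        for (size_t j = 0; j < n; j++) {
            if (values[j] < values[i])
                smaller++;
            else if (values[j] == values[i])
                equal++;              /* counts values[i] itself as well */
        }
        /* Ties occupy ranks smaller+1 .. smaller+equal; take their mean. */
        ranks[i] = (double)smaller + ((double)equal + 1.0) / 2.0;
    }
}
```

Ranking both the predictand series and a predictor series in this way and
then computing Pearson's correlation on the two rank arrays yields
Spearman's rho.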
7.7 Model Generation


The statistical downscaling models for the IMD grids are generated
programmatically using the MLR and MvLR techniques, as discussed in section
6.8. EMSDM generates a separate mathematical model for each IMD grid by
automating the algorithm given in Pseudo Code 14. Figures 7.24 and 7.25
depict the model parameters for IMD grids 009X034Y and 010X030Y
respectively, corresponding to their spatially referenced CanESM2 grids.
The model parameters and their coefficients vary from grid to grid. In
comparison with existing statistical downscaling tools such as SDSM, EMSDM
automates model development for multiple grids in one execution cycle,
which saves climate researchers the time otherwise spent generating models
grid by grid.
Figure 7.24: Regression Results for IMD Grid 009X034Y

Figure 7.25: Regression Results for IMD Grid 010X030Y


7.8 Time Series Analysis 

In order to demonstrate the applicability of the downscaling models
generated by the previous process, a time series is generated for IMD grid
022X053Y, located at longitude 77.25° and latitude 32.75°. The grid
position is shown in Figure 7.27 and the corresponding monthly time series
of precipitation is shown in Figure 7.26.



Figure 7.26: Time Series for IMD Grid 022X053Y

Figure 7.27: Selected IMD Grid 022X053Y for Time Series Generation


7.9 Geo-visualization of Results

The downscaling results are available to analysts for geo-visualization
through a Python-based web portal. Figure 7.28 depicts the downscaled
information of the climate variable (precipitation); its context menu
provides the location, index and decadal precipitation obtained from EMSDM
for different decades. Figure 7.29 depicts the overlaid downscaled
precipitation information for India, with the same context menu. Analysts
can query information about the grids as shown in the figure. Through the
web-based portal, the downscaling results generated by EMSDM are made
widely available to decision makers.
Figure 7.28: Geo-visualization of EMSDM for India

Figure 7.29: Geo-visualization EMSDM Result for India, UP and MP

7.10 Concluding Remarks 


In this chapter, the application of EMSDM has been demonstrated for India 
using the CanESM2 GCM data-set and the IMD data-set, showing that it can 
successfully downscale climate data-sets. EMSDM automates the underlying 
processes and generates valuable structured spatial metadata for the GCM and 
local climate data-sets, which can be efficiently utilized for statistical 
downscaling of climate data for a specified area of interest. 

The procedures required for implementing EMSDM have been discussed in detail. 
The algorithms and procedures of EMSDM are programmed in the C language. 
Hence, EMSDM can be implemented irrespective of the operating system 
(Windows, Unix, Linux, macOS, etc.) and possesses the advantage of 
interoperability over the widely used Windows-based SDSM software. Moreover, 
in addition to carrying out statistical downscaling, EMSDM generates 
interoperable spatial data for the corresponding GCM and local grids. These 
spatial data can be used for different climate studies. Following the 
discussion of the implementation of EMSDM, its applicability is demonstrated 
in Chapter 7 using CanESM2 as the GCM data-set, IMD as the local data-set and 
India as the selected AOI. 
Chapter 8 

Conclusions and Recommendations 

8.1 Conclusions 

The following conclusions have been drawn from the present research work: 

1. Statistical downscaling is the most widely applied method for downscaling 
climate data. Widely used software such as the Statistical DownScaling 
Model (SDSM) and the Long Ashton Research Station Weather Generator 
(LARS-WG) implement statistical downscaling to downscale climate model 
output to the local scale. In these software packages, the execution time 
of the downscaling process for a single grid is approximately 30 minutes. 
In addition, these packages are very time-inefficient for multi-site 
downscaling. The statistical downscaling model developed in this thesis 
(implemented in the C language) is very time-efficient both for a single 
grid and for multi-site downscaling. 

2. Existing software does not provide functionality for carrying out 
statistical downscaling of large regions in a single execution. Hence, with 
the presently available software, significant time and manual effort are 
required to carry out statistical downscaling in a piecewise manner. The 
model proposed in this thesis, EMSDM, is able to automate multi-site 
downscaling efficiently, so human intervention is not required for 
multi-site downscaling using EMSDM. 

3. On the basis of a review of existing research works, it can be concluded 
that a generalized computational downscaling model applicable to any 
specified AOI, irrespective of its geographical extent, is not available. 

4. In comparison to statistical downscaling techniques, dynamical 
downscaling techniques are still computationally intensive in terms of 
computational time and also entail a considerable amount of expensive 
hardware. Hence, statistical downscaling techniques are more appropriate 
for uncertainty studies, since it becomes feasible to perform downscaling 
for various types of climate scenarios. 

5. Monthly precipitation or temperature data are more suitable for 
developing downscaling models, since monthly downscaling models usually 
have better predictive capability than daily models. 

6. The salient features of EMSDM are: 

(a) Downscaling for a given area of interest can be carried out using 
different types of climate data, such as precipitation and temperature. 

(b) The data can be downscaled irrespective of geographical extent. 

(c) It is scalable: the framework can be extended to include new 
computational algorithm(s) for model development. 

(d) It comprises a Web GIS component for effective geovisualization and 
querying of downscaled data in the form of vector maps. Hence, the 
downscaling results are available to decision makers for further 
investigation. 

(e) It implements parametric as well as non-parametric approaches for 
model parameter selection. Hence, it provides the flexibility to filter 
out extreme events during model development if required by the decision 
maker. 

(f) In comparison with other approaches, the execution time for carrying 
out downscaling for large regions, such as a country, is significantly 
less. 

(g) It provides the functionality to generate spatial metadata grids for 
GCM and local observations. These spatial metadata data-sets can assist 
investigators in analysing the spatio-temporal distribution of 
climatological parameters such as temperature and precipitation, and 
further facilitate geo-statistical analysis of these parameters. 

(h) It can be used to carry out sensitivity analysis of GCM parameters. 
Analysts can provide their own set of GCM outputs, with a specified order 
of weighting, to develop a regression model and subsequently generate 
time series to analyse the effect of these parameters on climate 
prediction. 
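
The regression-based sensitivity analysis mentioned in the last feature can be 
illustrated with a small, self-contained sketch. It fits a multivariate linear 
regression by ordinary least squares and compares the coefficient of 
determination when each predictor is left out in turn. This drop-one 
comparison is a stand-in chosen for illustration, not the weighting procedure 
of EMSDM; the data are synthetic, and the predictor and function names are 
hypothetical. 

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    n = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot solve")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def r_squared(X, y, beta):
    """Coefficient of determination of the fitted model on (X, y)."""
    mean_y = sum(y) / len(y)
    predicted = [beta[0] + sum(b * x for b, x in zip(beta[1:], row))
                 for row in X]
    ss_res = sum((t - p) ** 2 for t, p in zip(y, predicted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def drop_one_sensitivity(X, y, names):
    """R^2 of the full model and of each model with one predictor removed."""
    results = {"all": r_squared(X, y, fit_linear_regression(X, y))}
    for k, name in enumerate(names):
        reduced = [[v for j, v in enumerate(row) if j != k] for row in X]
        beta = fit_linear_regression(reduced, y)
        results["without " + name] = r_squared(reduced, y, beta)
    return results


# Synthetic data generated as y = 2 + 3*a + 0.5*b (illustration only).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [6.0, 8.5, 13.0, 15.5, 19.5]

beta = fit_linear_regression(X, y)
results = drop_one_sensitivity(X, y, ["predictor_a", "predictor_b"])
for label, score in results.items():
    print(label, round(score, 4))
```

A larger drop in the score when a predictor is removed suggests that the fit 
is more sensitive to that predictor. 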

8.2 Future Recommendations 

Based on EMSDM, a full-scale standalone cloud-based system may be developed 
and deployed so that users can utilize it for further investigation of 
climate change in a specified area of interest. 